(57) Abstract



The present invention relates to 186 novel human secreted proteins and isolated nucleic acids containing the coding regions of the 
genes encoding such proteins. Also provided are vectors, host cells, antibodies, and recombinant methods for producing human secreted 
proteins. The invention further relates to diagnostic and therapeutic methods useful for diagnosing and treating disorders related to these 
novel human secreted proteins. 
186 Human Secreted Proteins

Field of the Invention 

This invention relates to newly identified polynucleotides and the polypeptides 
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and 
their production.

Background of the Invention 

Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eukaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The cell
uses "sorting signals," which are amino acid motifs located within the protein, to target
proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a leader
sequence, directs a class of proteins to an organelle called the endoplasmic reticulum
(ER). The ER separates the membrane-bounded proteins from all other types of
proteins. Once localized to the ER, both groups of proteins can be further directed to
another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to
vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other
organelles.

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the extracellular
space - a process called exocytosis. Exocytosis can occur constitutively or after receipt
of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or
secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell
membrane can also be secreted into the extracellular space by proteolytic cleavage of a
"linker" holding the protein to the membrane.

Despite the great progress made in recent years, only a small number of genes
encoding human secreted proteins have been identified. These secreted proteins include
the commercially valuable human insulin, interferon, Factor VIII, human growth
hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the
pervasive role of secreted proteins in human physiology, a need exists for identifying
and characterizing novel human secreted proteins and the genes that encode them. This
knowledge will allow one to detect, to treat, and to prevent medical disorders by using
secreted proteins or the genes that encode them.
Summary of the Invention

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, 
and recombinant methods for producing the polypeptides and polynucleotides. Also
provided are diagnostic methods for detecting disorders related to the polypeptides, and 
therapeutic methods for treating such disorders. The invention further relates to 
screening methods for identifying binding partners of the polypeptides. 

Detailed Description

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated
polynucleotide could be part of a vector or a composition of matter, or could be
contained within a cell, and still be "isolated" because that vector, composition of
matter, or particular cell is not the original environment of the polynucleotide.

In the present invention, a "secreted" protein refers to those proteins capable of
being directed to the ER, secretory vesicles, or the extracellular space as a result of a
signal sequence, as well as those proteins released into the extracellular space without
necessarily containing a signal sequence. If the secreted protein is released into the
extracellular space, the secreted protein can undergo extracellular processing to produce
a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage.

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited
with the ATCC. For example, the polynucleotide can contain the nucleotide sequence
of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the
coding region, with or without the signal sequence, the secreted protein coding region,
as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated
amino acid sequence generated from the polynucleotide as broadly defined.

In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the
ATCC Deposit Number. The ATCC is located at 12301 Park Lawn Drive, Rockville,
Maryland 20852, USA. The ATCC deposit was made pursuant to the terms of the
Budapest Treaty on the international recognition of the deposit of microorganisms for
purposes of patent procedure.

A "polynucleotide" of the present invention also includes those polynucleotides
capable of hybridizing, under stringent hybridization conditions, to sequences contained
in SEQ ID NO:X, the complement thereof, or the cDNA contained within the clone
deposited with the ATCC. "Stringent hybridization conditions" refers to an overnight
incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750 mM NaCl,
75 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution,
10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed
by washing the filters in 0.1x SSC at about 65°C.

Also contemplated are nucleic acid molecules that hybridize to the
polynucleotides of the present invention at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation of formamide concentration (lower percentages
of formamide result in lowered stringency), salt conditions, or temperature. For
example, lower stringency conditions include an overnight incubation at 37°C in a
solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA,
pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA;
followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even
lower stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g., 5X SSC).
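
As an illustration of the buffer strengths named above, the following minimal Python sketch scales a stock buffer linearly to a target fold strength. It assumes simple proportional dilution; the function name `dilute` and the dictionary layout are hypothetical and are not part of the specification. The stock compositions are taken from the text (5x SSC = 750 mM NaCl, 75 mM sodium citrate; 20X SSPE = 3M NaCl, 0.2M NaH2PO4, 0.02M EDTA).

```python
# Stock compositions in mM, as stated in the text.
SSC_5X = {"NaCl": 750.0, "sodium citrate": 75.0}
SSPE_20X = {"NaCl": 3000.0, "NaH2PO4": 200.0, "EDTA": 20.0}


def dilute(stock, stock_fold, target_fold):
    """Scale every component of a stock buffer to the target fold strength.

    Assumes concentrations scale linearly with fold strength (hypothetical
    helper for illustration only).
    """
    if stock_fold <= 0 or target_fold <= 0:
        raise ValueError("fold strengths must be positive")
    return {name: conc * target_fold / stock_fold for name, conc in stock.items()}


# Stringent wash buffer: 0.1x SSC.
wash_ssc = dilute(SSC_5X, stock_fold=5, target_fold=0.1)

# Lower stringency hybridization buffer: 6X SSPE.
hyb_sspe = dilute(SSPE_20X, stock_fold=20, target_fold=6)

if __name__ == "__main__":
    for label, buffer in (("0.1x SSC", wash_ssc), ("6X SSPE", hyb_sspe)):
        parts = ", ".join(f"{name} {conc:g} mM" for name, conc in buffer.items())
        print(f"{label}: {parts}")
```

Under this assumption, the 0.1x SSC wash corresponds to about 15 mM NaCl and 1.5 mM sodium citrate, and 6X SSPE to about 900 mM NaCl, 60 mM NaH2PO4, and 6 mM EDTA.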

Note that variations in the above conditions may be accomplished through the
inclusion and/or substitution of alternate blocking reagents used to suppress
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above, due
to problems with compatibility.

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly (A) stretch or the complement thereof (e.g., practically any
double-stranded cDNA clone).
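
The exclusion above can be illustrated with a simple composition check. The sketch below is a simplification made for illustration: it treats a probe as excluded when it consists only of A residues or only of T (or U) residues, and it does not model hybridization itself. The function name `is_homopolymer_tract` is hypothetical.

```python
def is_homopolymer_tract(probe: str) -> bool:
    """Return True if the probe is made up solely of A, solely of T, or solely of U.

    Simplified stand-in for "hybridizes only to polyA+ sequences or to a
    complementary stretch of T (or U) residues"; an empty probe returns False.
    """
    bases = set(probe.upper())
    return bases in ({"A"}, {"T"}, {"U"})


if __name__ == "__main__":
    for probe in ("AAAAAAAAAA", "TTTTTTTT", "AAAGCAAA"):
        print(probe, is_homopolymer_tract(probe))
```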

The polynucleotide of the present invention can be composed of any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single-
and double-stranded DNA, DNA that is a mixture of single- and double-stranded
regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded or a mixture of single- and
double-stranded regions. In addition, the polynucleotide can be composed of triple-stranded
regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also
contain one or more modified bases or DNA or RNA backbones modified for stability
or for other reasons. "Modified" bases include, for example, tritylated bases and
unusual bases such as inosine. A variety of modifications can be made to DNA and
RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically
modified forms.

The polypeptide of the present invention can be composed of amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and
may contain amino acids other than the 20 gene-encoded amino acids. The
polypeptides may be modified by either natural processes, such as posttranslational
processing, or by chemical modification techniques which are well known in the art.
Such modifications are well described in basic texts and in more detailed monographs,
as well as in a voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid side-chains and the amino
or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a
given polypeptide may contain many types of modifications. Polypeptides may be
branched, for example, as a result of ubiquitination, and they may be cyclic, with or
without branching. Cyclic, branched, and branched cyclic polypeptides may result
from posttranslational natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins
such as arginylation, and ubiquitination. (See, for instance, PROTEINS -
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.
H. Freeman and Company, New York (1993); POSTTRANSLATIONAL
COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990);
Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO:Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1. 

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay,
with or without dose dependency. In the case where dose dependency does exist, it
need not be identical to that of the polypeptide, but rather substantially similar to the
dose-dependence in a given activity as compared to the polypeptide of the present
invention (i.e., the candidate polypeptide will exhibit greater activity or not more than
about 25-fold less and, preferably, not more than about tenfold less activity, and most
preferably, not more than about three-fold less activity relative to the polypeptide of the
present invention).
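
The fold-difference ranges in this definition can be sketched as a small classifier. This is a minimal illustration that assumes both activities are positive numbers measured in the same assay and the same units, and that treats the "about" limits (25-fold, tenfold, three-fold) as exact cutoffs, which the definition itself does not require. The function name `activity_tier` and the tier labels are hypothetical.

```python
def activity_tier(candidate_activity: float, reference_activity: float) -> str:
    """Classify a candidate polypeptide by how many fold less active it is
    than the reference polypeptide (hypothetical helper, exact cutoffs assumed).
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    # fold_less <= 1 means the candidate is at least as active as the reference.
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred"      # greater activity, or not more than ~3-fold less
    if fold_less <= 10:
        return "preferred"           # not more than ~10-fold less
    if fold_less <= 25:
        return "within definition"   # not more than ~25-fold less
    return "outside definition"


if __name__ == "__main__":
    for candidate in (200.0, 20.0, 5.0, 1.0):
        print(candidate, activity_tier(candidate, reference_activity=100.0))
```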

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

This gene is expressed primarily in testes tumor and to a lesser extent in fetal
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly of the testes, and defects of the central nervous system
such as seizure and neurodegenerative disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly cancer of the testes and central nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., testes and other reproductive tissue, brain and
other tissue of the nervous system, blood cells, spleen, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of testicular cancer and
treatment of central nervous system disorders since this gene is primarily expressed in
the testes tumor and developing brain.

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

This gene is expressed primarily in cancer tissues, such as breast cancer and
Wilms' tumor, and to a lesser extent in fetal tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancers and/or tumors, particularly those found in the breast, and developmental
abnormalities or disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the glandular tissues, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., mammary
tissue, fetal tissue, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO. 314 as residues: Pro-11 to Thr-18, Leu-43 to Pro-50,
Gly-64 to Leu-72, and Leu-81 to Lys-86.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of cancers and/or tumors,
particularly those found in the breast, since expression is mainly in cancer/tumor
tissues. They may also serve as therapeutic proteins for proliferation/differentiation of fetal
tissues.
FEATURES OF PROTEIN ENCODED BY GENE NO: 3

This gene is expressed primarily in CD34 depleted buffy coat and to a lesser 
extent in spleen, chronic lymphocytic leukemia. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: blood disorders or
leukemias, diseases of the immune system. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., blood cells, spleen, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of blood disorders or
leukemias, diseases of the immune system since expression is in tissues related to
immune function.



FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

This gene is expressed primarily in CD34 depleted buffy coat. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: blood disorders or
lymphocytic diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., blood 
cells, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of blood disorders since 
expression is in tissues related to immune function. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in CD34 depleted buffy coat. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: blood or immune 

diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., blood cells, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a
disorder, relative to the standard gene expression level, i.e., the expression level in
healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO. 317 as residues:
Pro-13 to Lys-21.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of blood disorders since 
expression is in tissues related to immune function. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 6

This gene is expressed primarily in CD34 depleted buffy coat. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: blood or immune 

diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., blood cells, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO. 318 as residues:
Lys-31 to Lys-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of blood diseases since it
is expressed in tissues related to immune function.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

This gene is expressed primarily in CD34 depleted buffy coat and to a lesser 
extent in pineal gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diseases of the immune 
system and brain associated diseases. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., blood cells, and pineal gland, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of blood disorders, 
immune diseases or brain associated diseases (specifically of the pineal gland) since 
expression is in tissues related to immune function. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 8 

The translation product of this gene shares sequence homology with an organic 
cation transporter which is thought to be important in organic cation uptake in the 
kidney and liver. (See Accession No. 2343059.) Preferred polypeptide fragments 
comprise the amino acid sequence
mAiQMiCLVNXELYPTFVRNXGVMVCSSLCDlGGliTPHVFRLREVWQALPLILFAVLGLLAAVQTSEPSGT
(SEQ ID NO: 615) or TMKDAENLGRKAKPKENT (SEQ ID NO: 616) as well
as N-terminal and C-terminal deletions of these fragments. Also preferred are 
polynucleotide fragments encoding these polypeptide fragments. 

This gene is expressed primarily in liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: hepatic and renal 

diseases where drug elimination/cation exchange (organic cation uptake) in the liver and
kidney are problematic. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hepatic or renal system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g., kidney
and liver, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO. 320 as residues: Asn-64 to Asn-74, and Gln-81 to Gly-87.
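
Epitope listings of this form ("Xaa-N to Yaa-M") can be read mechanically. The sketch below is one possible way to parse such a listing into residue ranges and span lengths; the regular expression, the function name `parse_epitopes`, and the tuple layout are assumptions made for illustration and are not part of the specification. The input string is the listing given above for SEQ ID NO. 320.

```python
import re

# Matches ranges such as "Asn-64 to Asn-74": three-letter residue code,
# hyphen, position, "to", three-letter residue code, hyphen, position.
EPITOPE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(listing: str):
    """Return a list of (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))
        for m in EPITOPE_RE.finditer(listing)
    ]


def span_length(epitope) -> int:
    """Number of residues covered by a range, counting both end positions."""
    _, start, _, end = epitope
    return end - start + 1


if __name__ == "__main__":
    listing = "Asn-64 to Asn-74, and Gln-81 to Gly-87"
    for epitope in parse_epitopes(listing):
        print(epitope, span_length(epitope))
```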

The tissue distribution and homology to organic cation transporter indicate that 
polynucleotides and polypeptides corresponding to this gene are useful as a polyspecific 
transporter that is important for drug elimination in the liver (and possibly kidney) since 

expression is found in the liver.

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

This gene is expressed primarily in eosinophil induced with IL-5 and to a lesser 
extent in fetal liver and spleen. This gene also maps to chromosome 15, and therefore 

can be used in linkage analysis as a marker for chromosome 15.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diseases of the immune 
system, particularly allergies or asthma. Similarly, polypeptides and antibodies directed 

to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., blood cells, liver, and spleen, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of diseases involving
eosinophil reactions since expression seems to be concentrated in eosinophils and other
tissues involved in immunity such as the liver and spleen.

FEATURES OF PROTEIN ENCODED BY GENE NO: 10 

This gene is expressed primarily in tissues of hematopoietic lineage and to a
lesser extent in Hodgkin's lymphoma. Any frame shifts in this sequence can easily be
clarified using known molecular biology techniques.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 

not limited to, immune deficiency or dysfunction. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues and
cell types (e.g., hematopoietic cells, lymphoid and reticuloendothelial tissues, and
cancerous tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of lymphomas or
immune dysfunction or as a therapeutic protein useful in immune modulation based on
expression in anergic T-cells and lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 11

This gene is expressed primarily in neutrophils and to a lesser extent in activated 
lymphoid cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the cell type present in a biological sample and 
for diagnosis of diseases and conditions: inflammation. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues and
cell types (e.g., blood cells and lymphoid tissue, and cancerous and wounded tissues)
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO. 323 as residues: Glu-40 to Lys-46.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for modulation of an immune reaction or as a
growth factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 

This gene is expressed primarily in brain and to a lesser extent in activated
T-cells. It is likely that the open reading frame containing the predicted signal peptide
continues in the 5' direction. Preferred polypeptide fragments comprise the amino acid 
sequence PRVRNSPEDLGLSLTGDSCKL (SEQ ID NO:617). 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neurodegenerative
disorders including ischemic shock, Alzheimer's disease and cognitive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the central nervous
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues and cell types (e.g., blood cells, brain and other tissue of
the nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO. 324 as residues: Ser-5 to Glu-14, Ile-21 to Pro-35, Ser-65 to Asp-81, Cys-89 to
Val-96, Lys-136 to Ser-145, Ile-152 to Met-169, and Arg-189 to Lys-196.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis or treatment of cancers of the given
tissue or in the treatment of neurological disorders of the CNS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 13

This gene was also recently cloned by other groups, who named this calcium-activated
potassium channel gene hKCa4. (See Accession No. AF033021; see also Accession
No. 2584866.) This gene maps to human chromosome 19q13.2. A second signal
sequence likely exists upstream from the predicted signal sequence as described in
Table 1. Preferred polypeptide fragments comprise: QADDLQATVAALCVLRGGGPWAG
SWLSPKTPGAMGGDLVLGLGALRRRKRLL (SEQ ID NO: 618); or EQEKSLAGWALVLAXXGIGL
MVLHAEMLWFGGCSAVNATGHI^DTL\\TJPI^ 
TALLVAVVARKLEFNKAEKHVHNFMMDIQYTKEMKES AARVLQEAWMFYKHTRRKESHAAR
XHQRXLLAAINAITIQVRLKHRKLREQVNS
KLDALTELLSTALGPRQLPEPSQQSK (SEQ ID NO: 619), as well as N-terminal and
C-terminal deletions. Also preferred are polynucleotide fragments encoding these
polypeptide fragments. 

This gene is expressed primarily in breast lymph node and T-cells, and to a
lesser extent in placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: hematologic and
immune disorders. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., lymphoid
tissue, blood cells and placenta, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO. 325 as residues: Arg-13 to Lys-23.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment or diagnosis of hematologic
diseases and diseases involving immune modulation, based on distribution in the lymph
node and T-cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 14

This gene was recently cloned by another group, who called it PAPS synthase.
(See Accession No. el 204 135.) Preferred polypeptide fragments comprise the amino
acid sequence YQAHHVSRNKRGQVVGTRGGFRGCTVWLTGLSGAGK (SEQ ID NO: 620).
Also preferred are the polynucleotide fragments encoding this polypeptide fragment.

It has been discovered that this gene is expressed primarily in benign prostate
hyperplasia and human umbilical vein endothelial cells, and to a lesser extent in smooth
muscle and human endometrial stromal cells treated with estradiol.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: inflammation, ischemia,
and restenosis, based on endothelial cell and smooth muscle cell expression, and 
prostate diseases such as benign prostate hyperplasia or prostate cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the prostate or vessels 
of the circulatory system, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., prostate, endothelial 
cells, smooth muscle, and endometrium, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO. 326 as residues: Arg-21 to Asp-26, Lys-35 to Lys-44,
and Glu-49 to Asn-58.
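
By way of illustration only, residue ranges written in the form used above (a three-letter
residue code and position, the word "to", and a second residue code and position) can be read
programmatically. The following is a minimal sketch, not part of the disclosed methods; the
function names parse_epitope_ranges and epitope_lengths are hypothetical, and the sketch
assumes that positions are 1-based and that both endpoints belong to the epitope.

```python
import re

# Matches ranges such as "Arg-21 to Asp-26": three-letter residue code, hyphen, position.
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) for each range in text."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def epitope_lengths(text):
    """Length in residues of each range, counting both endpoints (an assumption)."""
    return [end - start + 1 for _, start, _, end in parse_epitope_ranges(text)]


if __name__ == "__main__":
    listing = "Arg-21 to Asp-26, Lys-35 to Lys-44, Glu-49 to Asn-58"
    for entry in parse_epitope_ranges(listing):
        print(entry)
    print(epitope_lengths(listing))
```

A listing parsed this way can then be checked against the length of the corresponding
sequence before any residue is extracted.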

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating or diagnosing diseases or conditions
in which the endothelial cell lining of the veins and arteries or the underlying smooth
muscle is involved.

FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

This gene is expressed primarily in human 6 week embryo and to a lesser extent 
in placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: developmental
anomalies or fetal deficiencies. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly developmental in nature, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., embryonic 
tissue, and placenta, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO. 327 as residues Lys-50 to Glu-57.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for detection of developmental abnormalities. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

This gene is expressed primarily in kidney and amygdala and to a lesser extent
in fetal tissues. This gene is mapped to chromosome 14, and therefore is useful in
linkage analysis as a marker for chromosome 14. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) present in a biological sample and
for diagnosis of diseases and conditions: kidney diseases, neurological disorders and
developmental abnormalities. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s). For a number of disorders of the above tissues, particularly of the renal 
system or developing fetal tissues, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., kidney,
amygdala, and fetal tissues, and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.
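
By way of illustration only, the comparison described above between the expression level
measured in a sample and the standard gene expression level can be sketched as follows. The
specification does not state a numerical cutoff for "significantly higher or lower" levels;
the function name classify_expression and the two-fold default threshold are assumptions
made purely for this example.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard (healthy) level.

    Returns "higher", "lower", or "comparable". The fold_threshold is a
    hypothetical cutoff chosen for illustration, not a value from the text.
    """
    if standard_level <= 0:
        raise ValueError("standard_level must be positive")
    if sample_level < 0:
        raise ValueError("sample_level must not be negative")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    # Levels are in arbitrary units; only the ratio matters here.
    print(classify_expression(10.0, 4.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(5.0, 4.0))
```

In practice the cutoff would be set from replicate measurements and an appropriate
statistical test rather than a fixed ratio.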

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of conditions affecting 
the brain, kidneys and fetal development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 17

This gene is expressed primarily in ovarian cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: solid tumors similar to 
ovarian cancer. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the reproductive system, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., ovarian and other 
reproductive tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO. 329 as residues Ser-51 to Val-56. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment of solid tumors of the
reproductive system such as ovarian cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

This gene is expressed primarily in brain medulloblastoma. Preferred polypeptide 
fragments comprise the amino acid sequence: IRHEQHPNFSLEMHSKGSSLLLFLPQL 
ILELPVCAHLHEELNC (SEQ ID NO: 643) and SFFISEEKGHLLLQAERHPWVAGALVGVSG 
GLTLTTCSGPTEKPATKNYFLKRLLQEMHIRAN (SEQ ID NO: 644), as well as N-terminal 
and C-terminal deletions. Also preferred are polynucleotide fragments encoding these 
polypeptide fragments. 
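
By way of illustration only, the N-terminal and C-terminal deletions of a polypeptide
fragment referred to above can be enumerated by removing residues one at a time from either
end. This is a minimal sketch; the function name terminal_deletions and the min_length
parameter (the shortest fragment retained) are hypothetical and are not taken from the
specification.

```python
def terminal_deletions(fragment, min_length=6):
    """Enumerate N-terminal and C-terminal deletion variants of a peptide.

    fragment is a one-letter amino acid string. min_length is an assumed lower
    bound on the length of the variants returned. Returns a pair of lists:
    (n_terminal_deletions, c_terminal_deletions).
    """
    max_removed = len(fragment) - min_length
    # Remove i residues from the amino terminus.
    n_terminal = [fragment[i:] for i in range(1, max_removed + 1)]
    # Remove i residues from the carboxy terminus.
    c_terminal = [fragment[:len(fragment) - i] for i in range(1, max_removed + 1)]
    return n_terminal, c_terminal


if __name__ == "__main__":
    # Fragment of SEQ ID NO: 643 as printed above.
    peptide = "IRHEQHPNFSLEMHSKGSSLLLFLPQLILELPVCAHLHEELNC"
    n_del, c_del = terminal_deletions(peptide, min_length=30)
    print(len(n_del), n_del[0])
    print(len(c_del), c_del[0])
```

Each variant differs from the parent fragment only at one terminus; internal deletions are
outside the scope of this sketch.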

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors, particularly of
the CNS. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the central nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., brain and other 
tissue of the nervous system, and cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating medulloblastoma or similar tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 19 

This gene is expressed primarily in adipocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: obesity. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the adipose tissues,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., adipocytes, and cancerous and wounded tissues)
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating obesity by regulating the function and
number of adipocytes.

FEATURES OF PROTEIN ENCODED BY GENE NO: 20 

This gene is expressed primarily in B cell lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, disorders of the immune system, with an emphasis on B cell lymphoma.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the tumors of the 
immune system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues and cell types (e.g., blood cells, and lymphoid
tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of B cell derived 
tumors based on its expression in B cell lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 21 

This gene is expressed primarily in immune cells and to a lesser extent in fetal
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: inflammatory diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., cells of the immune system, and fetal 
tissues, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO:333 as residues Asp-10 to Pro-19, Ser-74 to Tyr-79, and Glu-95 to Lys-110.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of diseases involving alterations in T
cell activity.

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

It has been discovered that this gene is expressed primarily in ovarian tumor. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors particularly of 
the ovary. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
tumors of the reproductive organs, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., ovarian
and other reproductive tissue and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO. 334 as residues: Leu-22 to Gln-27. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of ovarian tumors as it 
has only been identified in ovarian tumors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

It has been discovered that this gene is expressed primarily in fetal tissues and to 
a lesser extent in an osteoclastoma cell line.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: osteoporosis or arthritis.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
skeletal system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., bone cells, and fetal tissue, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of conditions of abnormal bone 
remodeling due to enhanced activity of osteoclasts. This may be useful as a specific 
marker for malignancies derived from osteoclasts or their precursors. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 24 

The translation product of this gene shares sequence homology with a
periplasmic ribonuclease, which is thought to be important in degrading extracellular
polynucleotides.

It has been discovered that this gene is expressed primarily in serum-treated
smooth muscle cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: vascular disease such as 
restenosis. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the vasculature, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., smooth muscle, and cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 336 as residues: 
Gln-30 to Lys-36, and Pro-41 to Arg-48. 

The tissue distribution and homology to ribonucleases indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment of
pathological conditions of smooth muscle associated with bacterial or viral infiltration.

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

This gene is expressed primarily in early stage human brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: human brain 
development and related diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly those affecting human brain development and related diseases,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., brain and other tissue of the nervous system, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to this gene indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
diseases affecting human brain development and related diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 26

It has been discovered that this gene is expressed primarily in human brain
tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: human brain diseases
and other diseases related to, or caused by, brain diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
human brain diseases, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., brain and other tissue of 
the nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to the gene indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
human brain diseases and other related diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

It has been discovered that this gene is expressed primarily in anergic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune diseases, 
inflammatory diseases and diseases related to T lymph cells. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune diseases, 
inflammatory diseases and diseases related to T lymph cells, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., blood cells, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution and homology to the gene indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
immune diseases, inflammatory diseases and diseases related to T lymph cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 28 

The translation product of this gene shares sequence homology with the Shigella
flexneri positive transcriptional regulator CriR (criR) gene, which is thought to be
important in regulation of gene expression.

This gene is expressed primarily in human synovial sarcoma and normal human 
brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: human brain diseases
particularly sarcomas of the synovium. Similarly, polypeptides and antibodies directed 
to these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the human brain and synovium and other related human
brain diseases, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., synovial tissue, and brain and other tissue of the
nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of human synovial 
sarcoma and other related human brain diseases. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 29 

This gene is expressed in bone marrow, infant brain, fetal liver and spleen, 
prostate and to a lesser extent in pineal gland, adipose tissue, kidney, adrenal gland, 
umbilical vein endothelial cells, and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions: diseases related to bone marrow or
hematoplastic tissues, prostate, kidney, adrenal gland, and cardiovascular tissue or
organs. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the diseases related to 
hematoplastic tissues, immune system, prostate, kidney, adrenal gland, and
cardiovascular tissue or organs, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., bone marrow, 
hematopoietic cells, pineal gland, adipose tissue, kidney, adrenal gland, endothelial 
cells, and blood cells, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to the gene indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
diseases related to hematoplastic tissues, immune system, prostate, kidney, adrenal 
gland, and cardiovascular tissue or organs. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 30 

This gene is expressed primarily in meningea and to a lesser extent in breast and
adult brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diseases of the
meningea and related brain diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the meningea and related brain diseases, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., meningea, mammary tissue, and brain and other tissue of
the nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of diseases of the 
meninges and related brain diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 31 

This gene is expressed in meninges, fetal spleen, osteoblasts and to a lesser 
extent in activated T-cells, endometrial stromal cells, fetal lung, HL-60, thymus, testis 
and endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: meningeal disease, 
osteoporosis, immune diseases, and hematoplastic diseases. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for identification of the tissue(s) or cell type(s). For a number of disorders of the 
above tissues or cells, particularly of the meningeal diseases, osteoporosis, immune 
diseases, and hematoplastic diseases, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., blood 
cells, endometrium, lung, thymus, testis, and endothelial cells, and cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 
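
The comparison described above, in which the expression level measured in a sample is judged against the standard level from healthy tissue or bodily fluid, can be illustrated with a minimal sketch. The specification does not state what counts as "significantly" higher or lower, so the two-fold cutoff, the function name `classify_expression`, and the parameter names below are assumptions made only for illustration.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Classify a sample's expression level relative to a standard level.

    sample_level   -- measured expression in the individual's sample
    standard_level -- expression in healthy tissue or bodily fluid
    fold_threshold -- hypothetical cutoff (assumption, not from the source)

    Returns "higher", "lower", or "unchanged".
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # Work on a log2 scale so that increases and decreases are symmetric.
    log_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)

    if log_ratio >= cutoff:
        return "higher"
    if log_ratio <= -cutoff:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Illustrative values only; not measurements from the specification.
    print(classify_expression(8.0, 2.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(3.0, 2.0))
```

In practice the standard level would presumably be derived from a population of individuals not having the disorder, and a statistical test would replace the fixed fold-change cutoff used here.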

The tissue distribution and homology to gene indicate that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and treatment of 
meningeal disease, osteoporosis, immune diseases, hematoplastic diseases, testis diseases and 
lung diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32 

This gene is expressed primarily in human thymus and to a much lesser extent 
in infant brain, T-cells, smooth muscle, endothelial cells, bone marrow, human ovarian 
tumor and keratinocytes, testes, osteoclastoma, breast, and tonsils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diseases involving the 
thymus, particularly thymic cancer and diseases involving T-cell maturation. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the thymus, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues and cell types (e.g., thymus, brain, and other tissue of the nervous system, 
blood cells, bone marrow, ovaries, and testes, and other reproductive tissue, mammary 
tissue, tonsils, melanocytes and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to gene indicate that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis and treatment of 
diseases of the thymus, particularly thymic cancer and diseases involving T-cell 
maturation. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 33 

This gene is expressed primarily in human tonsils, and placenta, and to a lesser 
extent in adipocytes, melanocyte, and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: inflammatory diseases, 
immune diseases, and obesity. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the inflammatory diseases, immune diseases, and obesity, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues and cell types (e.g., tonsils, placenta, adipocytes, melanocytes, and brain and 
other tissue of the nervous system, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to this gene indicate that polynucleotides 
and polypeptides corresponding to this gene are useful for diagnosis and treatment of 
diseases such as inflammation, immune diseases, and obesity. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

This gene is expressed in activated T cells, and to a lesser extent in pituitary, 
testis, and breast lymph node. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diseases relating to T 
cells. Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
disorders of the immune system, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues and cell types (e.g., pituitary, testes 
and other reproductive tissue, mammary tissue, and lymphoid tissue, and cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

This gene is expressed primarily in infant brain. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neurological disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
diseases relating to neurological disorders, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues and cell types (e.g., 
brain, and other tissue of the nervous system, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 36 

This gene is expressed primarily in infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neurological disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
diseases relating to neurological disorders, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues and cell types (e.g., 
brain and other tissue of the nervous system, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 37 

This gene is expressed primarily in human ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: ovarian cancer. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
ovarian disorders such as those involving germ cells, ovarian follicles, stromal cells, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., ovary and other reproductive tissue, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of ovariopathy. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

This gene is expressed primarily in lymph node breast cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: breast cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the breast cancer, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., mammary tissue and lymphoid tissue, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a diagnostic marker for breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

This gene is expressed primarily in brain and to a lesser extent in other tissues. 
Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neuronal disorders such 
as trauma, brain degeneration, and brain tumor. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the brain, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., brain and other tissue of the nervous system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and therapeutic treatment of 
neuronal disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

This gene is expressed in early stage human embryo, adrenal gland tumor, and 
immune tissues such as fetal liver, fetal spleen, T-cell, and myeloid progenitor cell line 
and to a lesser extent in ovary, colon cancer, and a few other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumorigenesis including 
adrenal gland tumor, colon cancer and various other tumors, developmental and 
immune disorders. Similarly, polypeptides and antibodies directed to these polypeptides 
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the cancer tissues, early stage human tissues, and immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., liver, spleen, blood cells, bone marrow, ovary 
and other reproductive tissue, and colon, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and therapeutic treatment of immune 
and developmental disorders, and tumorigenesis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 41 

This gene is expressed primarily in fetal lung, endothelial cells, liver, thymus 
and a few other immune tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: immune disorders such 
as immune deficiency and autoimmune diseases, pulmonary diseases, liver diseases, 
and tumor metastasis. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the fetal lung, liver, endothelial cells, and immune tissues, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues and cell types (e.g., lung, endothelial cells, liver, thymus, and other tissue of 
the immune system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis of immune disorders and pulmonary 
and hepatic diseases. Its promoter may also be used for immune system- and 
lung-specific gene therapies. The expression of this gene in endothelial cells indicates that it 
may also be involved in angiogenesis and may therefore play a role in tumor metastasis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 42 

This gene is expressed primarily in liver, thyroid, parathyroid and to a lesser 
extent in fetal lung, stomach and early embryos. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: metabolic regulation, 
obesity, hepatic failure, hepatocellular tumors or thyroiditis and thyroid tumors. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the digestive/endocrine 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues and cell types (e.g., liver, thyroid, parathyroid, lung, 
stomach, and embryonic tissue, and cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and the extracellular locations indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for the detection 
and treatment of digestive/endocrine disorders, including metabolic regulation, hepatic 
failure, malabsorption, gastritis and neoplasms. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 43 

This gene is expressed primarily in schizophrenic adult brain, pituitary, frontal 
cortex, hypothalamus and to a lesser extent in retina, adipose and stomach cancer and 
placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: schizophrenia and other 
neurological disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues and cell types (e.g., retinal 
tissue, adipose, stomach, and placenta, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful in treatment/detection of disorders in the nervous 
system, including schizophrenia, neurodegeneration, and neoplasia. Additionally, a 
secreted protein in brain may serve an endocrine function. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 44 

The translation product of this gene shares sequence homology with GTP 
binding proteins which are thought to be important in signal transduction and protein 
transport. 

This gene is expressed primarily in umbilical vein and microvascular endothelial 
cells, GM-CSF treated macrophage, anergic T cells, osteoblast, osteoclast, CD34+ cells 
and to a lesser extent in gall bladder. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: bone formation and 
growth, osteonecrosis, osteoporosis, angiogenesis and/or hematopoiesis. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the skeletal and 
hematopoietic systems, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., endothelial cells, blood 
cells, bone, and gall bladder, and cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to GTP binding proteins indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for 
treatment/detection of bone formation and growth, osteonecrosis, osteoporosis, and/or 
hematopoiesis because of its involvement in growth signaling or angiogenesis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 45 

The translation product of this gene shares sequence homology with signal 
sequence receptor gamma subunit, which is thought to be important in protein 
translocation on the endoplasmic reticulum. 

This gene is expressed primarily in adrenal gland, salivary gland, prostate, and 
to a lesser extent in endothelial cells and smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: protein secretion. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
secretory organs, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues and cell types (e.g., adrenal gland, salivary gland, 
prostate, endothelial cells, and smooth muscle, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to SSR gamma subunit indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for endocrine 
disorders, prostate cancer, xerostomia or sialorrhea. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 46 

This gene is expressed primarily in osteoclastoma cells and to a lesser extent in 
melanocyte, amygdala, brain, and stomach. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: ossification, 
osteoporosis, fracture, osteonecrosis, osteosarcoma. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the skeletal systems, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues and 
cell types (e.g., melanocytes, amygdala, brain and other tissue of the nervous system, 
and stomach, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful in intervention of ossification, osteoporosis, 
fracture, osteonecrosis and osteosarcoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 48 

The translation product of this gene shares sequence homology with proline-rich 
proteins, which are thought to be important in protein-protein interaction. 

This gene is expressed primarily in brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neurological and 
psychological disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system and endocrine system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., brain and other tissue of the nervous system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to proline-rich proteins indicate that 
polynucleotides and polypeptides corresponding to this gene are useful in intervention 
and detection of neurological diseases, including trauma, neoplasia, degenerative or 
metabolic conditions in the central nervous system. Additionally, the gene product may be 
secreted by the brain as an endocrine factor. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 49 

The translation product of this gene shares sequence homology with the AOCB 
gene from Aspergillus nidulans which is important in asexual development. 

This gene is expressed primarily in infant brain and to a lesser extent in the 
developing embryo, trachea tumors, B-cell lymphoma and synovial sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neurodegenerative 
diseases, leukemia and sarcomas. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the brain and immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., embryonic tissue, blood cells, trachea, and synovial tissue, and cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in infant brain and sarcomas and homology to a gene 
involved in a key step of eukaryotic development (fungal spore formation) indicate 
that the protein product of this clone could play a role in neurological diseases such as 
schizophrenia, particularly in infants. The existence of the gene in a B-cell lymphoma 
indicates the gene may be used in the treatment and detection of leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 50 

This gene is expressed primarily in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: pulmonary disorders 
including lung cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the pulmonary system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., lung, and 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution of this gene only in fetal lung indicates that it plays a key 
role in development of the pulmonary system. This would suggest that misregulation of 
the expression of this protein product in the adult could lead to lymphoma or sarcoma 
formation, particularly in the lung. It may also be involved in predisposition to certain 
pulmonary defects such as pulmonary edema and embolism, bronchitis and cystic 
fibrosis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

This gene is expressed primarily in hematopoietic cell types and fetal cells and to 
a lesser extent in all tissue types. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: defects in the immune 
system and hematopoiesis. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune and hematopoietic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., hematopoietic cells, and fetal tissue, and cancerous and wounded tissues)
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution of this gene predominantly in hematopoietic cells and in
the developing embryo indicates that polynucleotides and polypeptides corresponding to
this gene are useful for detection and treatment of lymphomas and disease states
affecting the immune system or hematopoiesis disorders such as leukemia, AIDS,
arthritis and asthma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 52

This gene is expressed primarily in prostate and to a lesser extent in fetal spleen, 
fetal liver, infant brain and T cell leukemias.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: prostate disorders,
prostate cancer, leukemia. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system and/or prostate gland, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., thymus, spleen, liver, brain and other tissue of the nervous system, and
blood cells, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution of this gene in prostate indicates that polynucleotides and
polypeptides corresponding to this gene are useful for detection or treatment of prostate
disorders or prostate cancer. Its distribution in fetal liver and fetal spleen indicates it 
may play a role in the immune system and its misregulation could lead to immune 
disorders such as leukemia, arthritis and asthma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

The translation product of this gene shares sequence homology with dynein.

This gene is expressed primarily in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neurodegenerative
diseases of the brain. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly neurodegenerative diseases, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g., brain
and other tissue of the nervous system, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The predominant tissue distribution in the brain and homology to dynein, a
microtubule motor protein involved in the positioning of cellular organelles and
molecules, indicate that polynucleotides and polypeptides corresponding to this gene
are useful for detection/treatment of neurodegenerative diseases, such as Alzheimer's,
Huntington's and Parkinson's diseases, and schizophrenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

The translation product of this gene shares sequence homology with ubiquitin-conjugation
protein, an enzyme which is thought to be important in the processing of
the Huntington's disease-causing gene.

This gene is expressed primarily in brain and to a lesser extent in activated 
macrophages. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neurodegenerative
disease states including Huntington's disease. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of brain tissues. For a number of disorders of the above
tissues or cells, particularly of the neurological systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., brain and other tissue of the nervous system, and blood cells, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The predominant tissue distribution of this gene in the brain and its homology to
a Huntington interacting protein indicate that polynucleotides and polypeptides
corresponding to this gene are useful for the regulation of the expression of the
Huntington disease gene and other neurodegenerative diseases including
spinocerebellar ataxia types I and III, dentatorubropallidoluysian and spinal bulbar
muscular atrophy. In addition, the existence of elevated levels of free ubiquitin pools in
Alzheimer's disease, Parkinson's disease and amyotrophic lateral sclerosis indicates
that the ubiquitin pathway of protein degradation plays a role in these disease states.
Thus, considering the gene described here is homologous to a ubiquitin-conjugation
protein, it may play a general role in neurodegenerative conditions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 56

This gene is expressed primarily in T-cells (anergic T-cells, resting T-cells,
apoptotic T-cells) and lymph node (breast), as well as brain (hypothalamus, 
hippocampus, pituitary, infant brain, early-stage brain). 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune (e.g.
immunodeficiencies, autoimmunities, inflammation, leukemias & lymphomas) and
neurological (e.g. Alzheimer's disease, dementia, schizophrenia) disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the central nervous,
hematopoietic and immune systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., blood
cells, lymphoid tissue, and brain and other tissue of the nervous system, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful in the intervention or detection of pathologies
associated with the hematopoietic and immune systems, such as anemias (leukemias).
In addition, the expression in brain (including fetal) might suggest a role in
developmental brain defects, neurodegenerative diseases or behavioral abnormalities
(e.g. schizophrenia, Alzheimer's, dementia, depression, etc.).

FEATURES OF PROTEIN ENCODED BY GENE NO: 57 

This gene is expressed primarily in lung, and to a lesser extent in a variety of 
other hematological cell types (e.g. Raji cells, bone marrow cell line, activated 
monocytes).

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: pulmonary and/or 
hematological dysfunction. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the vasculo-pulmonary and hematopoietic systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., lung and blood cells, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful in the intervention and detection of pathologies 
associated with the vasculo-pulmonary system. In addition, the expression of this gene
in a variety of leukocytic cell types and a bone marrow cell line might suggest a role in
hematopoietic and immune system disorders, such as leukemias & lymphomas, 
inflammation, immunodeficiencies and autoimmunities. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 58 

The translation product of this gene shares sequence homology with adenylate
kinase isozyme 3 (gi|163528 GTP:AMP phosphotransferase (EC 2.7.4.10) [Bos
taurus]), which is thought to be important in catalyzing the phosphorylation of AMP to
ADP in the presence of ATP or inorganic triphosphate.

This gene is expressed primarily in fetal liver, heart and placenta, and to a lesser
extent in many other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: hepatic, cardiovascular 
or reproductive disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hepatic, cardiovascular and reproductive systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., liver, heart, and placenta, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of conditions
related to hepatic function and pathogenesis, in particular, those dealing with liver
development and the differentiation of hepatocyte progenitor cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 59

This gene is expressed primarily in CD34 positive cells (Cord Blood).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: hematopoietic
differentiation and immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of hematopoietic and immune systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., hematopoietic cells, and blood cells, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful in the detection and treatment of conditions 
associated with CD34-positive cells, and therefore as a marker for cell differentiation in
hematopoiesis, as well as immunological disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

The translation product of the predicted open reading frame of this contig has 
sequence identity to the murine gene designated Insulin-Like Growth Factor-Binding 
Protein (IGFBP)-1 as described by Lee and colleagues (Hepatology 19 (3), 656-665
(1994)). 

This gene is expressed exclusively in hemangiopericytoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hemangiopericytoma and other pericyte or
endothelial cell proliferative disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the circulatory and immune systems, expression of this
gene at significantly higher or lower levels may routinely be detected in certain tissues
and cell types (e.g., pericyte or endothelial cells, and liver, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

Polynucleotides and polypeptides corresponding to this gene are useful as cell 
growth regulators since IGFBP-1-like molecules function as modulators of insulin-like
growth factor activity. In addition, since IGFBP-1 is expressed at high levels
following hepatectomy and during fetal liver development, polynucleotides of the
present invention may also be used for the diagnosis of developmental disorders.
Further, polypeptides of the present invention may be used therapeutically to treat
developmental liver disorders as well as to regulate hepatocyte and supporting cell
growth following hepatectomy or to treat liver disorders. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of 
hemangiopericytoma and liver disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 61 

This gene is expressed primarily in schizophrenic frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: nervous system and
cognitive disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the frontal cortex and CNS, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g., brain
and other tissue of the nervous system, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study, treatment and diagnosis of frontal 
cortex, neurodegenerative and CNS disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 62

This gene is expressed primarily in human adrenal gland tumor, and to a lesser 
extent in human kidney, medulla and adult pulmonary tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: metabolic, endocrine
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the endocrine and nervous system disorders and neoplasia, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., adrenal gland, kidney, brain and other tissue of the nervous system,
pulmonary tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for study, treatment and diagnosis of neurological
and endocrine disorders including neoplasia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 

This gene is expressed primarily in human adipocytes, and to a lesser extent in
spleen, 12-week old human, and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune, metabolic and
developmental disorders. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., adipocytes,
spleen, testes and other reproductive tissue, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for study, diagnosis and treatment of immune,
developmental and metabolic disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 64

One translated product of this clone is homologous to the mouse zinc finger protein 
PZF. (See Accession No. 453376; see also Gene 152 (2), 233-238 (1995).) Preferred
polypeptide fragments correspond to the highly conserved domains shared between
mouse and man. For example, preferred polypeptide fragments comprise the amino acid
sequence: LQCEICGFTCRQKASLNWHMKKHDADSFYQFSCNICGKKFEKKDSVVAHKAKSHPEV
(SEQ ID NO: 621); ITSTDILGTNPESLTQPSD (SEQ ID NO: 622);
NSTSGECLLLEAEGMSKSY (SEQ ID NO: 623);
CSGTERVSLMADGKIFVGSGSSGGTEGLVMNSDILGATTEVLIEDSDSAGP (SEQ ID NO: 624);
IQYVRCEMEGCGTVLAHPRYLQHHIKYQHLLKKKYVCPHPSCGRLFRLQKQLLRHAKHHT (SEQ ID NO: 625);
DQRDYICEYCARAFKSSHNLAVHRMIHTGEK (SEQ ID NO: 626);
RSSRTSVSRHRDTENTRSSRSKTGSLQLICKSEPNTDQLDY (SEQ ID NO: 627);
PFKDDPRDETYKPHLERETPKPRRKSG (SEQ ID NO: 630);
QYVRCEMEGCGTVLAHPRYLQHHIKYQHLLKKKYVCPHPSCGRLFRLQKQLLRHAKHHTD (SEQ ID NO: 629); or
residues 151-182 of QRDYICEYCARAFKSSHNLAVHRMIHTGEKHY (SEQ ID NO: 628). Also preferred
are polynucleotide fragments encoding these polypeptide fragments.
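
Fragments such as "residues 151-182" are given in the usual 1-based, inclusive residue numbering, which differs from the 0-based, end-exclusive slicing used by most programming languages. The sketch below shows one way to extract such a range and to check that a printed sequence uses only the 20 standard one-letter amino acid codes. The helper names `residues` and `is_valid_protein` and the placeholder `toy_protein` are hypothetical; the real full-length polypeptide is not reproduced here, so a run of 150 alanines stands in for residues 1-150.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def residues(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein string."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("residue range out of bounds")
    # Convert 1-based inclusive numbering to Python's 0-based, end-exclusive slice.
    return sequence[start - 1:end]


def is_valid_protein(sequence):
    """True if the string is non-empty and uses only standard one-letter codes."""
    return bool(sequence) and set(sequence) <= STANDARD_AMINO_ACIDS


# The 32-residue fragment as printed in the text.
fragment = "QRDYICEYCARAFKSSHNLAVHRMIHTGEKHY"

# Hypothetical stand-in for the full-length protein: 150 placeholder residues
# followed by the fragment, so that the fragment occupies positions 151-182.
toy_protein = "A" * 150 + fragment

extracted = residues(toy_protein, 151, 182)
```

A check like `is_valid_protein` is also a cheap way to catch transcription errors in a sequence listing, for example a digit appearing where a letter was intended.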

This gene is expressed primarily in Rhabdomyosarcoma, melanocyte and colon 
cancer tissue and to a lesser extent in smooth muscle, pancreatic tumor, and apoptotic 
T-cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to,. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and hemopoietic systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., striated muscle,
melanocytes, colon, smooth muscle, pancreas, and blood cells, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study, diagnosis and treatment of cancer and
hemopoietic disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 65

This gene is expressed primarily in human adipose and salivary gland tissue and
to a lesser extent in human bone marrow and fetal kidney.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: metabolic and immune
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and hemopoietic systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., adipose tissue,
salivary gland, bone marrow, and kidney, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study and diagnosis of metabolic and immune
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

The translated product of this gene was recently identified as oxytocinase splice variant.
(See Accession No. 2209276 and d.0,0O7 8.) Preferred polypeptide fragments
comprise the amino acid sequence: ... (SEQ ID NO: ...). Also preferred are
polynucleotide fragments encoding this polypeptide fragment.

FEATURES OF PROTEIN ENCODED BY GENE NO: 67

This gene is expressed primarily in hemopoietic cells, particularly apoptotic
T-cells, and to a lesser extent in primary dendritic cells and adipose tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of apoptotic T-cells, primary dendritic cells, and
adipose tissue present in a biological sample and for diagnosis of diseases and
conditions: hemopoietic diseases including cancer and general ...
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the oral
and intestinal mucosa as well as hemopoietic and immune systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., hematopoietic cells, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of diseases of the immune system,
including cancer, hemopoietic and infectious diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 68 

This gene is expressed primarily in kidney cortex and to a lesser extent in infant
brain, heart, uterus, and blood.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of kidney tissue present in a biological sample and 
for diagnosis of diseases and conditions: soft tissue cancer, inflammation, kidney 
fibrosis. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
nervous and endocrine systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., kidney,
brain, and other nervous tissue, heart, uterus, and blood cells, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for study and treatment of cancer and fibroses.

FEATURES OF PROTEIN ENCODED BY GENE NO: 69 

The translation product of this gene shares strong sequence homology with 
vertebrate and invertebrate protein tyrosine phosphatases.

This gene is expressed primarily in endometrial tumors, melanocytes, myeloid
progenitors and to a lesser extent in infant brain, adipocytes, and several hematopoietic
stem cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of transformed hematopoietic and epithelial cells 
present in a biological sample and for diagnosis of diseases and conditions which 
include, but are not limited to, of skin and endometrium, leukemia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the nervous and
hemopoietic systems, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., endometrium, melanocytes,
bone marrow, adipocytes, hematopoietic cells, and brain and other tissue of the nervous
system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and sequence similarity with tyrosine phosphatases 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for 
study and treatment of cancer and hematopoietic disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 70

This gene is expressed primarily in osteoclastoma, breast, and infant brain and 
to a lesser extent in various fetal and transformed bone, ovarian, and neuronal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: degenerative conditions
of the brain and skeleton. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the nervous and skeletal system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
bone, mammary tissue, and brain and other tissue of the nervous system, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for study and treatment of degenerative, 
neurological and skeletal disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 71

This gene was originally cloned from tumor cell lines. Recently another group 
has also cloned this gene, calling it the human malignant melanoma metastasis- 
suppressor (KiSS-1) gene. (See Accession No. U43527.) Preferred polypeptide 
fragments comprise the amino acid sequence: LEKVASVGNSRPTGQQLESLGLLA (SEQ ID
NO: 632); VHREEASCYCQAEPSGDL (SEQ ID NO: 633); RPALRQAGGGTREPRQKRWAGL
(SEQ ID NO: 634); and AVNFRPQRSQSM (SEQ ID NO: 635). Any frame shifts can easily be
resolved using known molecular biology techniques. 

This gene is expressed primarily in many types of carcinomas and to a lesser 
extent in many normal organs. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, cancer, particularly melanomas, and other hyperproliferative disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of 
transformed organ tissue, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Because this gene is a tumor suppressor
gene, increased amounts of the polypeptide can be used to treat patients having a
particular cancer.

The tissue distribution indicates that this gene and its translated product are
useful for the diagnosis and study of cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 72 

This gene is expressed primarily in striatum and to a lesser extent in adipocytes 
and hemangiopericytoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of striatal cells present in a biological sample and
for diagnosis of diseases and conditions: neurological, fat and lysosomal storage 
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the nervous and immune systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., striatal
tissue, adipocytes, and vascular tissue, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis, study and treatment of 
neurodegenerative and growth disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 73

This gene is expressed primarily in bone marrow stromal cells and to a lesser 
extent in smooth muscle, testes, endothelium, and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of bone marrow present in a biological sample and
for diagnosis of diseases and conditions: connective tissue and hematopoietic diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
skeletal and hematopoietic systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., bone
marrow, stromal cells, smooth muscle, testes and other reproductive tissue, 
endothelium, brain and other tissue of the nervous system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study, diagnosis, and treatment of connective 
tissue and blood diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 74

This gene is expressed primarily in brain, fetal liver and lung and to a lesser 
extent in retina, spinal cord, activated T-cells and endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of brain and regenerating liver present in a
biological sample and for diagnosis of diseases and conditions: CNS and spinal cord
injuries and immune disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the nervous and immune systems, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g., 
brain and other tissue of the nervous system, liver, pulmonary tissue, blood cells, and 
endothelial cells, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study and treatment of hematopoietic and 
neurological conditions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 75 

The translation product of this gene shares sequence homology with GTP 
binding proteins (intracellular). 

This gene is expressed primarily in bone marrow, brain, and melanocytes and to
a lesser extent in various endocrine and hematopoietic tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: hematopoietic and
nervous system conditions. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the nervous and immune systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g., bone
marrow, melanocytes, brain and other tissue of the nervous system, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to nucleotide binding factors indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for study, 
diagnosis, and treatment of brain degenerative, skin and blood diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 

This gene is expressed primarily in activated T-cells and to a lesser extent in 
retina, brain, and fetal bone. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of activated T-cells and developing brain present
in a biological sample and for diagnosis of diseases and conditions: immune 
deficiencies and skeletal and neuronal growth disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the nervous, immune, and skeletomuscular
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., blood cells, brain and other tissue of the 
nervous system, retinal tissue, and bone, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis, study and treatment of cancer,
urogenital, and brain degenerative diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 77 

This gene is expressed primarily in fetal liver, activated monocytes, and osteoblasts
and to a lesser extent in synovial, brain, and lymphoid tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of myeloid and lymphoid cells present in a biological
sample and for diagnosis of diseases and conditions: inflammation, immune
deficiencies, and cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system and skeleton, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
liver, blood cells, bone, synovial tissue, brain and other tissue of the nervous system, 
and lymphoid tissue, and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for study, diagnosis, and treatment of lymphoid 
and mesenchymal cancers and nervous system diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 78 

The translation product of this gene shares sequence homology with polymerase
polyprotein precursor, which is thought to be important in DNA repair and replication.

This gene is expressed primarily in infant brain and to a lesser extent in tumors
and tumor cell lines.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, those of the neural system and developing organs. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the neural system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., brain and other tissue of the nervous system, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to polymerase polyprotein precursor
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of cancers, especially of the neural system and developing
organs.

FEATURES OF PROTEIN ENCODED BY GENE NO: 79 

This gene is expressed primarily in muscle and endothelial cells and to a lesser 
extent in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: vascular diseases. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
vascular system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues and cell types (e.g., muscle, endothelial cells, brain and other tissue of the
nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of disorders of the
vascular and neural systems, including cardiovascular and endothelial disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed primarily in placenta and to a lesser extent in fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: developmental disorders
and disorders of the haemopoietic system, fetal liver and placenta. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly developmental
disorders and disorders of the haemopoietic system, fetal liver and placenta, expression
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues and cell types (e.g., placenta and liver, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of developmental
disorders and disorders of the haemopoietic system, fetal liver and placenta.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

This gene is expressed primarily in bone marrow, placenta and tissues and 
organs of the hematopoietic system. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: disorders of the bone 
and haemopoietic system. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune, bone and hematopoietic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., bone marrow, placenta, and hematopoietic cells, and cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of disorders of the
immune, bone and hematopoietic systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 82 

The translation product of this gene shares sequence homology with secretory 
carrier membrane protein, which is thought to be important in protein transport and
export. Any frame shifts in coding sequence can be easily resolved using standard
molecular biology techniques. Another group recently cloned this gene, calling it 
SCAMP. (See Accession No. 2232243.) 

This gene is expressed primarily in prostate, breast and spleen, and to a lesser 
extent in several other tissues and organs. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: disorders of the breast,
prostate and spleen. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly disorders of the breast, prostate and spleen, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., prostate, mammary tissue, and spleen, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder.

The tissue distribution and homology to secretory carrier membrane protein 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of disorders of the breast, prostate and spleen. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 83

This gene is expressed primarily in developing organs and tissue like placenta 
and infant brain and to a lesser extent in developed organs and tissue like cerebellum 
and heart. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neurological diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
neural system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., placenta, heart, brain and other 
tissue of the nervous system, and cancerous and wounded tissues) or bodily fluids 
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of diseases of the 
neural system including neurological disorders and cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 84 

The translation product of this gene shares sequence homology with ATPase 6
in Trypanosoma brucei, which is thought to be important in metabolism.

This gene is expressed primarily in tumor and fetal tissues and to a lesser extent 
in melanocytes, kidney cortex, monocytes and ovary.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: metabolism disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the fetal
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., fetal tissues, melanocytes, kidney, blood 
cells, ovary and other tissue of the reproductive system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution and homology to ATPase indicate that polynucleotides 
and polypeptides corresponding to this gene are useful for treatment and diagnosis of 
metabolism disorders, especially in fetal and tumor tissue growth. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 85 

The translation product of this gene shares sequence homology with the 
immunoglobulin superfamily of proteins which are known to be important in immune 
response and immunity. 

This gene is expressed primarily in stromal cells, colon cancer, lung, amygdala, and
melanocytes and to a lesser extent in a variety of other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: defects of stromal cell
development and cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the stromal cells, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues and cell types (e.g., stromal cells,
colon, lung, amygdala, and melanocytes, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to immunoglobulin indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of immune system disorders. 



WO 98/39448 




PCT/US98/04493 



FEATURES OF PROTEIN ENCODED BY GENE NO: 86 

The translation product of this gene shares sequence homology with
transcription initiation factor eIF-4 gamma, which is thought to be important in gene
transcription.

This gene is expressed primarily in tumor tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumorigenesis.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly in tumor 
tissues, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues and cell types (e.g., endometrium and lung, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to transcription initiation factor eIF-4
gamma indicate that polynucleotides and polypeptides corresponding to this gene are
useful for gene regulation in tumorigenesis. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 87 

The translation product of this gene shares low-level sequence homology in
prolines with secreted basic proline-rich peptide II-2, which is thought to be important in
protein structure or inhibiting hydroxyapatite formation in vitro. 

This gene is expressed primarily in endometrial tumor and fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: endometrial tumors.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
muscular/skeletal and reproductive systems, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
endometrium, and lung, and cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution and homology to secreted basic proline-rich peptide II-2 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
inhibiting hydroxyapatite formation or establishing cell/tissue structure.

FEATURES OF PROTEIN ENCODED BY GENE NO: 88 

This gene is expressed primarily in amniotic cells induced with TNF in culture,
and to a lesser extent in colon tissue from a patient with Crohn's Disease; parathyroid
tumor; activated T-cells; cells of the human Caco-2 cell line; adenocarcinoma; colon;
corpus callosum; fetal kidney; pancreas tumor; fetal brain; early stage brain; and anergic
T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the immune system
(e.g., tumors), expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., amniotic cells, colon, kidney,
pancreas, parathyroid, brain and other tissue of the nervous system, blood cells,
hematopoietic cells, liver, spleen, bone, testes and other reproductive tissue,
and epithelial cells, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that the protein product of this clone is useful
for modulating tumorigenesis and other immune system conditions such as disorders in
immune response. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

This gene is expressed primarily in fetal liver/spleen and hematopoietic cells and
to a lesser extent in brain, osteosarcoma, and testis tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: leukemia and
hematopoietic disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hematopoietic and immune systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., hematopoietic cells, liver, spleen, bone, testes, and other reproductive 
tissue, brain and other tissue of the nervous system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of hematopoietic and
immune disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 90 

The translation product of this gene shares weak sequence homology with
mouse Gcap1 protein, which is developmentally regulated in brain.

This gene is expressed primarily in infant and adult brain and fetal liver/spleen 
and to a lesser extent in smooth muscle, T cells, and a variety of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neurological or 
hematopoietic disorders. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the nervous, hematopoietic, immune, and endocrine systems, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues and cell types (e.g., brain and other tissue of the nervous system, blood cells, 
liver, spleen, and smooth muscle, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder.

The tissue distribution and its homology to Gcap1 protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment and
diagnosis of disorders in neuronal, hematopoietic, immune, and endocrine systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91 

This gene is expressed primarily in brain and hematopoietic cells and to a lesser 
extent in tumor tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: disorders of the nervous,
hematopoietic, and immune systems, and tumorigenesis. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the nervous, hematopoietic, and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., brain and other tissue of the nervous 
system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of disorders in the nervous, hematopoietic, and immune 
systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

The translation product of this gene shares sequence homology with 
neuroendocrine-specific protein A which is thought to be important in neurologic 
systems. 

This gene is expressed primarily in brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neural disorders and
degenerative disease. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central or peripheral nervous systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., hematopoietic cells, and brain and other tissue of the nervous system, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to neuroendocrine-specific protein A 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for 
treatment or diagnosis of neural disorders and degenerative disease.

FEATURES OF PROTEIN ENCODED BY GENE NO: 93 

The translation product of this gene shares sequence homology with collagen-like
protein and proline-rich protein, which are thought to be important in connective
tissue function and tissue structure.

This gene is expressed primarily in fetal liver/spleen and brain tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: neuronal or 
hematopoietic disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the nervous and hematopoietic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., liver, spleen, and brain and other tissue of the nervous system, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to collagen-like protein and proline-rich
proteins indicate that polynucleotides and polypeptides corresponding to this gene are
useful for supporting brain and hematopoietic tissue function and diagnosis and
treatment of disorders in these functions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 94 

This gene is expressed primarily in embryonic tissues and tumor tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system (e.g., tumors), expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., embryonic
tissue and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 95

This gene is expressed primarily in brain tumor, placenta, and melanoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: brain tumor or
melanoma. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the brain or melanocytes, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., brain and other tissue of
the nervous system, placenta, and melanocytes, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the translation product of this gene is useful
in the diagnosis and treatment of brain tumors and melanoma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 96 

The translation product of this gene shares sequence homology with a yeast
membrane protein, SUR4, which encodes for APA1 that acts on a glucose-signaling
pathway that controls the expression of several genes that are transcriptionally regulated
by glucose.

This gene is expressed primarily in fetal liver, and to a lesser extent in placenta
and breast tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects of fetal liver or
defects of glucose-regulated ATPase activities in tissues. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the fetal immune/hematopoietic system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., liver, placenta, and mammary tissue, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to yeast SUR4 membrane protein indicate 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of defects of fetal liver or defects of glucose-regulated ATPase 
activities. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 97 

This gene is expressed primarily in fetal liver, brain, and amniotic fluid. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: defects of the fetal 
immune system and adult brain. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the fetal immune system and adult brain, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell 
types (e.g., liver, and brain and other tissue of the nervous system, and cancerous and 
wounded tissues) or bodily fluids (e.g., amniotic fluid, serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that the protein product of this clone is useful 
for detecting defects of the fetal immune and hematopoietic systems since fetal liver is
the predominant organ responsible for hematopoiesis in the fetus. In addition, the gene
product of this gene is thought to be useful for detecting certain neurological defects of 
the brain. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 98

The translation product of this gene shares sequence homology with a yolk
protein precursor, vitellogenin, which is thought to be important in binding lipids such
as phosvitin.

This gene is expressed primarily in amniotic cells and fetal liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects in amniotic
cells, fetal liver development and the fetal immune system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this
gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., amniotic cells, 
and liver, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to vitellogenin indicate that the protein 
product of this clone is useful for treatment and diagnosis of defects in amniotic cells,
fetal liver development and the fetal immune system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 99 

This gene is expressed primarily in placenta, endometrial tumor, osteosarcoma,
and stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumor of the 
endometrium or bone, and osteosarcoma. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the obstetric system (e.g., placenta,
endometrium) and the bones, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., placenta,
endometrium, bone, and stromal cells, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of tumors and
abnormalities of the endometrium and the bones because of its abundance in the
aforementioned tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 100

This gene is expressed primarily in hepatocellular tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: hepatocellular tumor.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the liver,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., liver, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of hepatocellular cancer because of its abundant expression 
in this tissue. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 101 

This gene is expressed primarily in corpus callosum, fetal lung, and infant
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects of the corpus
callosum or defects of the fetal lung. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the corpus callosum and brain in general, and fetal lung,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., lung, and brain and other tissue of the nervous
system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of defects of the corpus callosum and brain in general, and
defects of fetal lung. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 102

This gene is expressed primarily in T cells and stromal cells, and to a lesser 
extent in adrenal gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects of T cell
immunity and stromal cell development. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., blood cells, stromal cells, and adrenal gland, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of defects of T cell immunity and stromal cell development 
because of its abundant expression in these tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 103

This gene is expressed primarily in infant brain and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects of the brain and
nervous system. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the nervous system, especially brain, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., brain and other tissue of the nervous system, placenta, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful
for detecting defects of the brain, especially in young children.

FEATURES OF PROTEIN ENCODED BY GENE NO: 105 

This gene is expressed primarily in human osteoclastoma and to a lesser extent 
in human pancreas tumor. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly osteoclastoma and pancreatic tumor. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly in transformed tissues,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., bone and pancreas, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of some types of tumors, particularly pancreatic cancer and
osteoclastoma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 106

This gene is expressed primarily in fetal liver/spleen, and to a lesser extent in 
activated T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., liver, spleen, and blood cells,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 107

This gene is expressed primarily in human embryo and to a lesser extent in 
spleen and chronic lymphocytic leukemia. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: leukemia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune or
hemopoietic systems, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., embryonic tissue, spleen,
and blood cells, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein product of this clone is useful 
for the diagnosis and treatment of leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 108

This gene is expressed primarily in placenta, and to a lesser extent in early stage 
human brain and in lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: fetal developmental
abnormalities. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly in
fetal and amniotic tissue, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., placenta, brain and
other tissue of the nervous system, and lung, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful for
production of growth factor(s) associated with fetal development. Preferred
polypeptides comprise the full-length polypeptide shown in the sequence listing,
truncated, however, at the amino terminus and beginning with QTE.

FEATURES OF PROTEIN ENCODED BY GENE NO: 109 

This gene is expressed primarily in fetal spleen, and to a lesser extent in B-cell
lymphoma and T-cell lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: lymphoma. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., spleen and blood cells, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful
for the treatment and diagnosis of human lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 110 

The translation product of this gene shares sequence homology with sarcoma
amplified sequence (SAS), a tetraspan receptor which is thought to be important in
malignant fibrous histiocytoma and liposarcoma.

This gene is expressed primarily in human osteoclastoma, and to a lesser extent 
in pineal gland and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: malignant fibrous
histiocytoma and liposarcoma. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., bone,
pineal gland, and brain and other tissue of the nervous system, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to sarcoma amplified sequence (SAS)
indicate that the protein product of this clone is useful for treatment of osteosarcoma,
malignant fibrous histiocytoma, liposarcoma, and related cancers, particularly
sarcomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 111 

The translation product of this gene shares sequence homology with bovine
mitochondrial 6.8K proteolipid protein.

This gene is expressed primarily in Wilms' tumor and to a lesser extent in
cerebellum and placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
5 biological sample and for diagnosis of diseases and conditions: Wilm's tumor. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
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type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune or renal systems, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues and cell types (e.g., brain and other tissue of 
the nervous system, and placenta, and cancerous and wounded tissues) or bodily fluids 
5 (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell 
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 
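
By way of illustration only, the comparison of a measured expression level against the standard gene expression level described above might be sketched as follows. The function name, the use of a mean over replicate measurements, and the twofold threshold are assumptions made for this sketch; the specification does not define what counts as "significantly" higher or lower.

```python
from statistics import mean


def classify_expression(sample_levels, standard_levels, fold_threshold=2.0):
    """Label a sample's expression as higher, lower, or unchanged.

    sample_levels: measurements from the individual being tested.
    standard_levels: measurements from healthy tissue or bodily fluid.
    fold_threshold: hypothetical cutoff; not taken from the specification.
    """
    sample_mean = mean(sample_levels)
    standard_mean = mean(standard_levels)
    if standard_mean <= 0 or sample_mean < 0:
        raise ValueError("expression levels must be non-negative, standard > 0")

    ratio = sample_mean / standard_mean
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    print(classify_expression([8.0, 10.0], [2.0, 2.0]))
```

In practice a statistical test over replicate samples would likely replace the fixed ratio; the ratio is used here only to keep the sketch short.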

The tissue distribution and homology to 6.8K proteolipid protein indicate that 
the protein product of this clone is useful for diagnostics and therapeutics associated with
tumors, particularly Wilms' tumor.

FEATURES OF PROTEIN ENCODED BY GENE NO: 112 

This gene is expressed primarily in embryonic tissue and to a lesser extent in
osteoblasts, endothelial cells, macrophages (GM-CSF treated), and bone marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: immune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
immune system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues and cell types (e.g., embryonic tissue, bone, 
endothelial cells, blood cells and bone marrow, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment or diagnosis of immune disorders.
Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: MITDVQLAIFANMLGVSLFLLWLYHYVAVNNPKKQE (SEQ ID NO: 636). 
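
As an illustrative aside, a recited amino acid sequence can be checked for characters outside the twenty standard one-letter residue codes, which is a quick way to spot transcription or scanning damage in a sequence listing. The sketch below applies such a check to the sequence of SEQ ID NO: 636 as printed above; the helper name `invalid_residues` is hypothetical and is not part of this specification.

```python
# The twenty standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def invalid_residues(sequence):
    """Return (position, character) pairs that are not standard residues.

    Positions are 1-based, matching the usual numbering of protein sequences.
    """
    return [
        (position, char)
        for position, char in enumerate(sequence, start=1)
        if char not in STANDARD_AMINO_ACIDS
    ]


# Sequence copied from the text above (SEQ ID NO: 636).
SEQ_636 = "MITDVQLAIFANMLGVSLFLLWLYHYVAVNNPKKQE"

if __name__ == "__main__":
    print(len(SEQ_636), invalid_residues(SEQ_636))
```

An empty result means only that every character is a valid residue code; it does not confirm that the sequence matches the deposited clone.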






FEATURES OF PROTEIN ENCODED BY GENE NO: 113 

This gene is expressed primarily in hepatocellular tumor, and to a lesser extent 
in fetal liver/spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumors, particularly 
hepatocellular tumors. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the hepatic system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., liver, and 
spleen, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein product of this clone is useful 
for diagnosis and treatment of tumors, particularly hepatocellular tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 114 

The translation product of this gene exhibits a very high degree of sequence 
identity with the human Pig8 gene, which is thought to be important in p53-mediated
apoptosis. The sequence of this gene has since been published by Polyak and
colleagues (Nature 389, 300-306 (1997)). In addition, the predicted translation product
of this contig exhibits very high sequence homology with a murine gene denoted
EI24, which is also thought to be important in p53-mediated apoptosis.

This gene is expressed primarily in infant brain and activated T-cells and to a
lesser extent in bone marrow, fetal liver, and prostate.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions which include, but are 
not limited to, tissue damage by radiation and anti-cancer drugs. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the nervous and 
immune systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues and cell types (e.g., brain and other tissue of the
nervous system, blood cells, bone marrow, liver, and prostate, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to human Pig8 and murine EI24 genes 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for 
preventing apoptosis in patients being treated with anti-cancer therapies such as
etoposide, hydroperoxycyclophosphamide, and X-irradiation, since this protein product
is upregulated in cells undergoing such treatment where p53 was overexpressed. It may 
also be useful in the treatment of hematopoietic disorders and in boosting numbers of 
hematopoietic stem cells by interfering with the apoptosis of progenitor cells. The 
mature polypeptide is predicted to comprise the following amino acid sequence:

EEMADSVKTFLQDlJ^RGIKDSrWGICT 

EPRIVSRIFQCCAWNGGVFWFSLLLFYRVHPVLQSVTARIIGDPSLHGDVWSWLEFFLTSIFSA 
LWVLPLFVLSKVVNAIWQDIADL^ 
FPIHLVGQLVSLLHMSLLYSLYCreYRWFNK^ 
SSYIISGCLFSILFPLFIISANEAKTPGKAYLFQ^

FPSPHPSPAKLKATAGH (SEQ ID NO: 637). Accordingly, polypeptides comprising the 
foregoing amino acid sequence are provided, as are polynucleotides encoding such
polypeptides. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 115

This gene is expressed primarily in stromal cells and to a lesser extent in 
multiple sclerosis. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions affecting the nervous
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
nervous system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., stromal cells and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of multiple sclerosis
and other autoimmune diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 116

This gene is expressed primarily in the gall bladder.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: gall stones or infection
of the digestive system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the digestive system or renal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., gall bladder and tissue of the digestive system, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for possible prevention of digestive disorders 
where there may be a lack of digestive enzymes produced or in the detection and
possible prevention of gall stones.

FEATURES OF PROTEIN ENCODED BY GENE NO: 117 

The translation product of this gene shares sequence homology with the dystrophin
gene, which is thought to be important in the building and maintenance of muscles.

This gene is expressed primarily in placenta and to a lesser extent in fetal brain,
fetal liver, and spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: muscular dystrophy,
Duchenne and Becker's muscular dystrophies. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the skeletal muscle system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., placenta, brain and other tissue of the nervous system, muscle,
liver, and spleen, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to the dystrophin gene indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for diseases
related to the degenerative myopathies that are characterized by the weakness and atrophy
of muscles without neural degradation, such as Duchenne and Becker's muscular
dystrophies.

FEATURES OF PROTEIN ENCODED BY GENE NO: 118

This gene is expressed primarily in olfactory tissue and to a lesser extent in 
cartilage. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: connective tissue
diseases; chondrosarcoma. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the connective tissue, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., olfactory
tissue and cartilage, and cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of tumors of
connective tissues, osteoarthritis, and chondrosarcoma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 119

This gene is expressed primarily in activated neutrophils and to a lesser extent
in fetal spleen, and CD34 positive cells from cord blood. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: allergies, defects in
hematopoiesis and inflammation. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune and hematopoietic systems,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., blood cells, and spleen, and cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for reducing the allergic effects felt by allergy
sufferers by neutralizing the activity of the immune system, especially since neutrophils
are abundant in persons suffering from allergies and other inflammatory conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 120 

The translation product of this gene shares sequence homology with poly A
binding protein II, which is thought to be important in RNA binding for transcription of
DNA to RNA.

This gene is expressed primarily in colon and to a lesser extent in brain and 
immune system. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: colon cancer. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune and
digestive systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., colon, tissue and cells of the 
immune system, and brain or other tissue of the nervous system, and cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to poly A binding protein II indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for detection
and treatment of colon cancer and other disorders of the digestive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 121

The translation product of this gene shares sequence homology with thymidine
diphosphoglucose 4,6-dehydratase, which is thought to be important in the metabolism of
sugar.

This gene is expressed primarily in fetal liver and spleen and to a lesser extent in
infant brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: diabetes. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the endocrine system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., liver, spleen, and brain and other tissue of the
nervous system, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to thymidine diphosphoglucose 4,6-dehydratase
indicate that polynucleotides and polypeptides corresponding to this gene are
useful for treatment of persons with diabetes since it appears that this protein is needed
in the metabolism of sugar into its more basic components.

FEATURES OF PROTEIN ENCODED BY GENE NO: 122

The translation product of this gene shares sequence homology with 
ceruloplasmin which is thought to be important in the metabolism and transport of iron 
and copper. Ceruloplasmin also contains domains with homology to clotting factors V 
and VIII. Defects in the circulating levels of ceruloplasmin (aceruloplasminemia) have 
been associated with certain disease conditions such as Wilson disease, and the
accompanying hepatolenticular degeneration. 

This gene is expressed primarily in brain and retina and to a lesser extent in 
endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: diseases marked by
defects in iron metabolism; aceruloplasminemia not characterized by defects in the
known ceruloplasmin gene locus; nonclassical Wilson disease; movement disorders;
and tumors derived from a brain tissue origin. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, retina, and nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., brain and other tissue of the nervous system, 
retinal tissue, and endothelial cells, and cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution and homology to ceruloplasmin indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for treatment of
patients with aceruloplasminemia, or other defects in iron and/or copper metabolism.
Mutations in this locus could also be diagnostic for patients currently experiencing or 
predicted to experience aceruloplasminemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 123 

This gene is expressed primarily in brain and B cell lymphoma and to a lesser
extent in fetal liver and spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: B cell lymphoma;
tumors and diseases of the brain and/or spleen; hematopoietic defects. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the brain and 
hematopoietic system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., brain and other tissue of
the nervous system, blood cells, liver, and spleen, and cancerous and wounded tissues) 
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of disorders in neuronal, 
hematopoietic, and immune systems. It could potentially be useful for
neurodegenerative disorders and neuronal and/or hematopoietic cell survival or 
proliferation. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 124

This gene is expressed primarily in osteoclastoma, dermatofibrosarcoma, and B 
cell lymphoma and to a lesser extent in endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, in particular osteoclastoma, dermatofibrosarcoma, and B cell
lymphoma. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the bone, immune, and circulatory systems, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g., 
bone, epidermis, blood cells, and endothelial cells, and cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of cancers and 
lymphoma; osteoporosis; and the control of cell proliferation and/or differentiation. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 125 

This gene is expressed primarily in immune tissues and hematopoietic cells, 
particularly in activated T cells and neutrophils, spleen, and fetal liver, and to a lesser 
extent in infant adrenal gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects in T cell 
activation; hematopoietic disorders; tumors of a hematopoietic and/or adrenal gland 
origin. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
hematopoietic and/or endocrine systems, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues and cell types (e.g., cells and
tissues of the immune system, hematopoietic cells, blood cells, liver, and adrenal gland, 
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for immune and/or hematopoietic disorders; 
diseases related to proliferation and/or differentiation of hematopoietic cells; defects in T
cell and neutrophil activation and responsiveness; and endocrine and/or metabolic 
disorders, particularly of early childhood. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 126 

This gene is expressed primarily in placenta and endothelial cells and to a lesser
extent in melanocytes and embryonic tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumors of an endothelial
cell origin; angiogenesis associated with tumor development and metastasis. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For a 
number of disorders of the above tissues or cells, particularly of the vascular system 
and developing embryo, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., placenta, endothelial
cells, melanocytes, and embryonic tissues, and cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of developmental disorders; 
inhibition of angiogenesis; and vascular patterning. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 127

This gene is expressed primarily in endothelial cells and hematopoietic tissues, 
including spleen, tonsils, leukocytes, and both B- and T-cell lymphomas.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: tumors of an endothelial 
cell and/or hematopoietic origin; leukemias and lymphomas. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune and vascular systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues and cell types (e.g., endothelial cells, hematopoietic cells, spleen,
tonsils, and blood cells, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 
taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the manipulation of angiogenesis; the
differentiation and morphogenesis of endothelial cells; the proliferation and/or 
differentiation of hematopoietic cells; and the commitment of hematopoietic cells to 
distinct cell lineages. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 128 

This gene is expressed primarily in kidney medulla and to a lesser extent in 
spleen from chronic myelogenous leukemia patients, prostate cancer, and some other 
tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors of a kidney
origin; chronic myelogenous leukemia; prostate cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the kidney and spleen, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
and cell types (e.g., kidney, spleen, and prostate, and cancerous and wounded tissues) 
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of kidney
disorders and cancer, particularly chronic myelogenous leukemia and prostate cancer.
It may also be useful for the enhancement of kidney tubule regeneration in the treatment 
of acute renal failure.

FEATURES OF PROTEIN ENCODED BY GENE NO: 129 

This gene is expressed primarily in adult and infant brain and to a lesser extent 
in mesenchymal or fibroblast cells, as well as tissues with a mesenchymal origin. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors of a brain and/or 
mesenchymal origin; neurodegenerative disorders; cancer; fibrosis. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the brain and of 
mesenchymal cells and tissues, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues and cell types (e.g., brain and other 
tissue of the nervous system and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene 
expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 
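
The comparison against a standard gene expression level described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classify_expression`, the use of the mean of healthy samples as the standard, and the two-fold threshold standing in for "significantly higher or lower" are all assumptions and are not specified in this description.

```python
from statistics import mean


def classify_expression(sample_level, standard_levels, fold_threshold=2.0):
    """Classify a sample's expression level relative to a healthy standard.

    sample_level    -- measured level in the test sample (arbitrary units)
    standard_levels -- levels measured in healthy tissue or bodily fluid
    fold_threshold  -- hypothetical cutoff for a "significant" difference
    """
    if sample_level < 0 or not standard_levels:
        raise ValueError("need a non-negative sample level and at least one standard")
    if any(level <= 0 for level in standard_levels):
        raise ValueError("standard levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    # Assumption: the standard level is the mean of the healthy samples.
    standard = mean(standard_levels)
    ratio = sample_level / standard

    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1 / fold_threshold:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    healthy = [2.0, 2.0]
    for level in (8.0, 0.5, 2.5):
        print(level, classify_expression(level, healthy))
```

In practice the cutoff would be chosen from replicate measurements and a statistical test rather than a fixed fold change.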

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis of tumors of brain and/or
mesenchymal origin; neurodegenerative disorders; cancer; and fibrosis, based upon the
expression of this gene within those tissues. Fibrosis is included because mesenchymal
cells and fibroblasts are the primary cellular targets involved in this pathological
condition.

FEATURES OF PROTEIN ENCODED BY GENE NO: 130 

This gene is expressed primarily in hepatocellular cancer and to a lesser extent in 
fetal tissues as well as testes tumor. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: liver cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the digestive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., liver, fetal tissue, and testes and other
reproductive tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 131

This gene is expressed only in infant early brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: development and
diseases of the nervous system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the brain and nervous system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
brain and other tissue of the nervous system and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating diseases of the brain in children and in
treating nervous system disorders such as Alzheimer's disease, schizophrenia,
dementia, and depression.

FEATURES OF PROTEIN ENCODED BY GENE NO: 132 

This gene is expressed primarily in brain and to a lesser extent in glioblastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: Alzheimer's disease,
schizophrenia, depression, mania, and dementia. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain and nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., brain and other tissue of the nervous system, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a
disorder, relative to the standard gene expression level, i.e., the expression level in
healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating brain disorders such as Alzheimer's 
disease, schizophrenia, depression, mania, and dementia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 133

The translation product of this gene shares sequence homology with ribitol
dehydrogenase of bacteria, which is thought to be important in the metabolism of sugars.

This gene is expressed primarily in macrophages and to a lesser extent in T-cell
lymphoma and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tissue destruction in
inflammation. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., blood cells and lung, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to ribitol dehydrogenase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for altering
macrophage metabolism in diseases such as inflammation where macrophages are
causing excess tissue destruction.

FEATURES OF PROTEIN ENCODED BY GENE NO: 134

This gene is expressed primarily in pancreatic tumor and to a lesser extent in 
synovial sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to,. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the endocrine and connective tissue systems, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
pancreas, and synovial tissue, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating and diagnosing various cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 135

This gene is expressed primarily in T cell lines such as Raji and to a lesser 
extent in infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune system
disorders and inflammation. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., blood
cells, and brain and other tissue of the nervous system, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating and diagnosing inflammatory diseases
such as rheumatoid arthritis, sepsis, inflammatory bowel disease, and psoriasis, as well
as neutropenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 136 

The translation product of this gene shares high sequence homology with the SAR1
subfamily of GTP-binding proteins, which is thought to be important in vesicular
transport in mammalian cells.

This gene is expressed primarily in serum-stimulated smooth muscle cells and to 
a lesser extent in a T-cell lymphoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: diseases affecting
vesicular transport. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the muscular system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., blood
cells, and smooth muscle, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to GTP-binding proteins indicate that
polynucleotides and polypeptides corresponding to this gene are useful for gene therapy
in treating the large number of diseases involving defective vesicular transport within
cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 137 

The translation product of this gene shares sequence homology with a protein
found in C. elegans cosmid F25B5.

This gene is expressed primarily in fetal tissues and to a lesser extent in
melanocytes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: abnormal fetal
development, especially of the pulmonary system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the fetal pulmonary system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., fetal tissue, pulmonary tissue, and melanocytes, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of diseases affecting
the pulmonary system, such as emphysema.

FEATURES OF PROTEIN ENCODED BY GENE NO: 138 

This gene is expressed primarily in gall bladder and to a lesser extent in smooth
muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: digestive system disease
and gall bladder problems. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the digestive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., gall
bladder and tissue of the digestive system, and smooth muscle, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating diseases of the digestive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 139 

This gene is expressed primarily in placenta and to a lesser extent in brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: abnormal fetal
development. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
developing tissues, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues and cell types (e.g., placenta, and brain and other
tissue of the nervous system, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating and diagnosing abnormal fetal 
development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 140 

This gene is expressed primarily in smooth muscle and to a lesser extent in 
ovary, prostate cancer, and activated monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions: hypertension and
atherosclerosis. Similarly, polypeptides and antibodies directed to these polypeptides 
are useful in providing immunological probes for differential identification of the 
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the circulatory system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues and cell types (e.g., smooth 
muscle, ovary and other reproductive tissue, prostate, and blood cells, and cancerous 
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or 
spinal fluid) or another tissue or cell sample taken from an individual having such a 
disorder, relative to the standard gene expression level, i.e., the expression level in 
healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating diseases of the circulatory system, 
such as hypertension and atherosclerosis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 141 

This gene is expressed primarily in fetal spleen and to a lesser extent in placenta 
and bone marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: anemia and other
diseases affecting blood cells. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the circulatory and pulmonary systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., spleen, placenta, bone marrow, and blood cells, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the generation of red and white blood cells and
for the diagnosis of diseases of these cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 142 

The predicted translation product of this contig is a human homolog of the
murine tetracycline/sugar transporter molecule recently reported by Matsuo and
colleagues (Biochem. Biophys. Res. Commun. 238 (1), 126-129 (1997)).

This gene is expressed primarily in synovium and to a lesser extent in 
endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: rheumatoid arthritis and
inflammation. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and lymphatic systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., synovial
tissue, and endothelial cells, and cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of inflammatory
diseases, such as rheumatoid arthritis, leukemia, neutropenia, inflammatory bowel
disease, psoriasis, sepsis, and the like.

FEATURES OF PROTEIN ENCODED BY GENE NO: 143 

This gene is expressed primarily in placenta and to a lesser extent in melanocytes,
fetal liver and spleen, and bone marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: abnormal early
development. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, expression of
this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., placenta, melanocytes,
liver, spleen, and bone marrow, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of abnormal early 
development phenomena and diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 144

This gene is expressed primarily in fetal liver and spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: anemia and neutropenia.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and blood systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., liver and spleen,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful in hematopoiesis and bone marrow regeneration,
as it is most abundant in fetal tissues responsible for the generation of hematopoietic
cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 145 

The translation product of this gene shares sequence homology with protein
tyrosine phosphatase, which is thought to be important in transducing signals that
activate cells such as T cells, B cells, and other cell types.

This gene is expressed primarily in T cells and tissues in early stages of 
development and to a lesser extent in cancers. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immuno-related
diseases and cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., embryonic
and fetal tissue, undifferentiated cells, and blood cells, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to the protein tyrosine phosphatase family 
indicate that polynucleotides and polypeptides corresponding to this gene are useful for 
modulating the immune system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 146 

This gene is expressed primarily in T cells and to a lesser extent in B cells,
macrophages, and tumor tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immuno-disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., blood cells, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for regulating the immune system and therefore can be
used in treating diseases such as autoimmune diseases and cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 147 

This gene is expressed primarily in placenta and to a lesser extent in endothelial
cells, testis tumor, ovarian cancer, and uterine cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., placenta,
endothelial cells, testis and ovary and other reproductive tissue, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 148 

This sequence has significant homology to mouse torsin A. Recently, another
group cloned the human Torsin A gene. (See, Accession No. 2358279; see also Nature
Genet. 17, 40-48 (1997).)

This gene is expressed primarily in osteoclastoma, T cells, and placenta and to a
lesser extent in fetal lung, fetal liver, fetal brain, adult brain, and tumor tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: disease conditions in
hematopoiesis and cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hematopoietic system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g., blood
cells, bone, placenta, lung, liver, and brain and other tissues of the nervous system,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treating blood-related diseases such as
deficiencies in red blood cells, white blood cells, platelets, and other hematopoietic cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 149 

This gene is expressed primarily in T cells, prostate and prostate cancer, and
endothelial cells and to a lesser extent in monocytes, dendritic cells, bone marrow,
salivary gland, colon cancer, stomach cancer, pancreatic tumor, uterine cancer, fetal
spleen, and osteoclastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immuno-related
diseases and cancers. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., blood
cells, prostate, endothelial cells, dendritic cells, bone marrow, salivary gland, colon,
stomach, pancreas, uterus, spleen and bone, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 150 

This gene was recently cloned by another group, which named it eIF3-p66. (See
Accession No. 2351378.) This gene plays a role in RNA binding and macromolecular
assembly, and therefore, any mutations in this gene would likely result in a diseased
phenotype. Preferred polypeptide fragments comprise the amino acid sequence:

MAKFMTPVIQDNPSGWGPCAVPEQFRDMPYQPFSKGDRLGKVADWTGATYQDKR
QFGGGSQYAYPliEEDESSFQLVDTARTQKTAYQRNRMR^
KSAKQKERERIRLQKKHJKQFGVRQKWDQKSQKPRDSSVEVRSDWEVKEEMDFPQLM
LEVSEPQDIECCGALEYYDKAFDRirrcSEKPLRXX
AILATLMSCTRSVYSWDIWQRVGSKLFFDKR^
AMEATTINHNFSQQCLRMGKERYNFPNPNPF
EHDG\WTGANGEVSFTMKTLNEWDSRHCNGVDW^
CALLAGSEYLKLGYVSRYHVKDSSRHVILGTQQF
EEGKYLILKDPNKQVIRVYSLPDGTFSS (SEQ ID NO: 638), as well as N-terminal and
C-terminal deletions of this polypeptide fragment.

This gene is expressed primarily in T cell, bone marrow, embryo and
endothelial cells and to a lesser extent in testis tumor and endometrial tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune diseases and
tumors. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system and reproductive system, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues and cell types (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for immune disorders and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 151

This gene is expressed primarily in testis and to a lesser extent in T cell, spinal 
cord, placenta, neutrophil and monocyte. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: male reproductive and
endocrine disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive, immune and endocrine systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., testis and other reproductive tissue, blood cells, tissue of the
nervous system, and placenta, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for regulating immune and reproductive functions. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 152 

The translation product of this gene shares sequence homology with
tyrosyl-tRNA synthetase, which is thought to be important in cell growth.

This gene is expressed primarily in brain, liver, keratinocytes, tonsils, and
heart.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and autoimmune diseases. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, liver, keratinocytes, tonsils and heart,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., brain and other tissues of the nervous system,
liver, keratinocytes, tonsils and heart, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to tyrosyl-tRNA synthetase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for modulating
cell growth.

FEATURES OF PROTEIN ENCODED BY GENE NO: 153 

This gene is homologous to the Drosophila transcriptional regulator dre4. (See
Accession No. 2511745.) Dre4 is a gene required for steroidogenesis in Drosophila
melanogaster and encodes a developmentally expressed homologue of the yeast
transcriptional regulator CDC68. Preferred polypeptide fragments comprise the amino
acid sequence: KKRHTDVQFTTEVGEITTO^
FIEKVEALTKEELEFEVPFRDLGFNGAPYRSTCLLQPTSSALVNATEWPPFVVTLDEVELIHFXR
VQFHLKNFDMVIVYKDYSKKVT^
DPEGFFEQGGWSFL (SEQ ID NO: 639), as well as N-terminal and C-terminal deletions of
this fragment. Also preferred are polynucleotide fragments encoding this polypeptide
fragment.

This gene is expressed primarily in fetal liver, spleen, placenta, lung, T cell,
thyroid, and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: brain tumor, heart and
liver diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the fetal liver, spleen, placenta, lung, T cell, thyroid and testes, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., liver, spleen, placenta, lung, blood cells, thyroid, and testes and other
reproductive tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.






FEATURES OF PROTEIN ENCODED BY GENE NO: 154 

This gene is expressed primarily in brain and to a lesser extent in fetal heart,
testis, spleen, and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: heart, liver and spleen
diseases, immunological diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the brain, fetal heart, testis, spleen and lung, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., brain and other tissue of the nervous system, heart, testes
and other reproductive tissue, spleen, and lung, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.


FEATURES OF PROTEIN ENCODED BY GENE NO: 155 

Activation of T cells through the T cell antigen receptor (TCR) results in the
rapid tyrosine phosphorylation of a number of cellular proteins, one of the earliest being
a 100 kDa protein. This gene is the human equivalent of murine valosin containing
protein (VCP). VCP is a member of a family of ATP binding, homo-oligomeric
proteins, and the mammalian homolog of Saccharomyces cerevisiae cdc48p, a protein
essential to the completion of mitosis in yeast. Both endogenous and expressed murine
VCP are tyrosine phosphorylated in response to T cell activation. Thus we have
identified a novel component of the TCR mediated tyrosine kinase activation pathway
that may provide a link between TCR activation and cell cycle control.

This gene is expressed primarily in brain, liver, spleen, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and immunological disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, liver, spleen and placenta, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., brain and other tissue of the nervous system, liver, spleen,
and placenta, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to VCP indicate that polynucleotides and
polypeptides corresponding to this gene are useful for treating cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 156 

The translation product of this gene shares sequence homology with rat growth
response protein which is thought to be important in cell growth. A group recently
cloned the human homolog of this gene, calling it insulin induced protein 1. (See
Accession No. 2358269, see also, Genomics 43 (3), 278-284 (1997).) Preferred
polypeptide fragments comprise the amino acid sequence:
RSGLGLGITIAFLATLITQFLVYNGVYQYTSPDFLYIRSWLPCIFFSGGVTVGNIGRQLAMGVPEKPHSD
(SEQ ID NO: 640), as well as N-terminal and C-terminal deletions of this polypeptide
fragment. Also preferred are polynucleotide fragments encoding these polypeptide
fragments.

This gene is expressed primarily in brain, liver, placenta, heart, spleen, and
lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer and immunological disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, liver, placenta, heart and spleen,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., brain and other tissue of the nervous system,
liver, placenta, heart, spleen, and lymphoid tissue, and cancerous and wounded tissues)
or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to growth-response protein indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for modulating 
cell growth. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 157

This gene is expressed primarily in glioblastoma, endometrial tumor,
lymphoma and pancreas tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: glioblastoma,
endometrial tumor, lymphoma and pancreas tumor. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., endometrium, lymphoid tissue, pancreas, and tissue of the nervous system,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 158 

The translation product of this gene shares sequence homology with IgE
receptor, which is thought to be important in allergy and asthma.

This gene is expressed primarily in T cell and fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: allergy and asthma and
other immunological disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues and cell types (e.g., blood cells, and
liver, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to IgE receptor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for allergy and
asthma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 159

The translation product of this gene shares sequence homology with
immunoglobulin heavy chain, which is thought to be important in the immune response
to antigen.

This gene is expressed primarily in activated neutrophil and to a lesser extent in
activated T cell, monocyte and heart.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: infection, inflammation
and cancer. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., blood cells, and heart, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to immunoglobulin heavy chain variable
region indicate that polynucleotides and polypeptides corresponding to this gene are
useful for making ligands that block specific antigens which cause certain diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 160 

The translation product of this gene shares sequence homology with mouse X
inactive specific transcript protein which is thought to be important in X chromosome
inactivation.

This gene is expressed primarily in HSA172 cell and to a lesser extent in normal 
ovary tissue, ovarian cancer, frontal cortex and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: ovarian tumor,
schizophrenia and other neurological disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and neural system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., ovary and other reproductive tissue, and brain and other
tissue of the nervous system, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to X inactive specific transcript protein
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of reproductive system tumors and CNS tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 161

This gene is expressed primarily in adipose cell and to a lesser extent in liver
and prostate.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: obesity and liver
disorder. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the adipose cell, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., adipose cells, liver, and
prostate, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of obesity and liver disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 162 

The translation product of this gene shares sequence homology with yeast
ubiquitin activating enzyme homolog which is thought to be important in protein
post-translational processing.

This gene is expressed primarily in stromal cell and to a lesser extent in retina,
H. Atrophic Endometrium, colon carcinoma and myeloid progenitor cell.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: defects of stromal cell
development, neuronal growth disorders and tumors. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and neural system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., retinal cells, endometrium, colon, and bone marrow, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to ubiquitin-activating enzyme homolog
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis or treatment of some types of tumors, fucosidosis and neuronal growth
disorders.


FEATURES OF PROTEIN ENCODED BY GENE NO: 163 

This gene is expressed primarily in primary breast cancer and 
hemangiopericytoma and to a lesser extent in adult brain and cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: breast cancer, leukemia
and cerebellum disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system and neural system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., mammary tissue, brain and other tissue of the nervous system, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis or treatment of various tumors and
diseases involving the neural system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 164

The translation product of this gene shares sequence homology with proline rich 
proteins. Recently, another group has also cloned this gene, calling it CD84 leukocyte 
antigen, a new member of the Ig superfamily. (See Accession No. U82988, see also, 
Blood 90 (6), 2398-2405 (1997).) 

This gene is expressed primarily in Weizmann olfactory tissue and
osteoclastoma and to a lesser extent in anergic T-cell.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: osteoporosis and immune
disease. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., olfactory tissue, bone, and
blood cells, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to the Ig superfamily indicate that the
protein product of this clone is useful for treatment of osteoporosis, autoimmune
disease, and other immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 165 

This gene is expressed primarily in atrophic endometrium and colon cancer and
to a lesser extent in some fetal tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: tumors. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., endometrium, colon, and fetal tissue, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of tumors, specifically 
endometrium and colon tumors. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 166 

This gene is expressed primarily in human primary breast cancer and to a lesser extent
in activated monocyte. Although the predicted signal sequence is identified in Table 1,
other upstream sequences are also relevant. Preferred polypeptide fragments comprise
the amino acid sequence: VTQPKHLSASMGGSVEIPFSFYYPWELAXXPXVRISWRRGHFHG
QSFYSTRPPSIHKDYVNRLFLNWTEGQESGFLRIS^RKEDQSVYFCRVELDTRRSG (SEQ ID
NO: 641), as well as N-terminal and C-terminal deletions. Also preferred are
polynucleotide fragments encoding these polypeptide fragments.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: breast cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., mammary tissue, and blood cells, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a
disorder, relative to the standard gene expression level, i.e., the expression level in
healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis of breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 167 

This gene is expressed primarily in fetal tissues and to a lesser extent in adult
lung. This gene has also been mapped to chromosomal location 9q34, and thus, can be
used as a marker for linkage analysis for chromosome 9.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the embryo tissues, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., fetal
tissues, and lung, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 168 

The translation product of this gene shares sequence homology with Ig Heavy
Chain which is thought to be important in immune response.

This gene is expressed primarily in prostate cancer tissue specifically.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: prostate cancer.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
prostate, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., prostate, tissue and cells of the immune
system, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 169 

The translation product of this gene shares sequence homology with cytosolic
acyl coenzyme-A hydrolase, which is thought to be important in neuron-specific fatty
acid metabolism. The gene represented by this contig has since been published by Hajra
and colleagues (GenBank Accession No. U91316).

This gene is expressed primarily in human pituitary gland and to a lesser extent
in colorectal cancer tissue. This gene has also been observed in the LNCAP cell line.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: hyperlipidemias of
familial and/or idiopathic origins. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly blood, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues and cell types (e.g., pituitary
and colon, and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to rat cytosolic acyl coenzyme-A
hydrolase indicate that polynucleotides and polypeptides corresponding to this gene are
useful for the detection or treatment of hyperlipidemia disease states by virtue of the
ability of specific drugs to activate the enzyme.

FEATURES OF PROTEIN ENCODED BY GENE NO: 170 

The translation product of this gene shares sequence homology with a 
Caenorhabditis elegans gene which is thought to be important in organism 
development. 

This gene is expressed primarily in human synovial sarcoma tissue and bone
marrow, and to a lesser extent in human brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, diseases of bone, specifically synovial sarcoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the bone, connective tissues and possibly
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., synovial tissue, bone marrow,
brain and other tissue of the nervous system, and cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to a Caenorhabditis elegans gene indicate that
polynucleotides and polypeptides corresponding to this gene are useful as a diagnostic
and/or therapeutic modality directed at the detection and/or treatment of connective
tissue sarcomas or other related bone diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 171 

The translation product of this gene shares sequence homology with beta1-6GlcNAc
transferase, which is thought to be important in the transfer and metabolism of
beta 1-6 N-acetylglucosamine. This gene product has previously been shown to
suppress melanoma lung metastasis in both syngeneic and nude mice, to decrease
invasiveness into Matrigel, and to inhibit cell attachment to collagen and laminin
without affecting cell growth.

This gene is expressed primarily in human testes and prostate tissues, and to a 
lesser extent in kidney, medulla, and pancreas. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly melanoma. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., testes and other reproductive tissue, prostate, kidney, pancreas, brain and
other tissue of the nervous system, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to beta1-6GlcNAc transferase indicate that
the protein product of this clone is useful for the development of diagnostic and/or
therapeutic modalities directed at the detection and/or treatment of cancer and the
metastasis of malignant tissue or cells. Defects in this potentially secreted enzyme may
play a role in metastasis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 172

This gene is expressed primarily in fetal spleen and liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: immune disorders,
Wilms' tumor, hepatic disorders, and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the hematopoietic and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., spleen and liver, and cancerous
and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or
spinal fluid) or another tissue or cell sample taken from an individual having such a
disorder, relative to the standard gene expression level, i.e., the expression level in
healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and identification of fetal defects 
along with correcting diseases that affect hematopoiesis and the immune system. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 173 

The translation product of this gene shares sequence homology with ret II 
oncogene which is thought to be important in Hirschsprung disease and many types of 
cancers. 

This gene is expressed in multiple tissues including the lymphatic system, brain,
and thyroid.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions: Hirschsprung disease and multiple
cancers. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune and
central nervous systems, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., lymphoid tissue,
thyroid, and brain and other tissue of the nervous system, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ret II oncogene indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of various cancers. They would also be useful for the diagnosis and treatment
of Hirschsprung disease. Preferred polypeptides of the invention comprise the amino
acid sequence: MEAQQVNTEAESAREQLQXLHDQIAGQKASKQELETELERLKQEFHYIEEDLY
RTKNTLQSRIKDRDEEIQKLRNQ^
VFQLERLEQQMNSASGSSSNGSSINMSGIDNGEGTRLRKVPVLFNDTETNLAGMY
SIDQFSIRLGIFLRRYPIARVFVIIYMALLHLWVMIVLLTYTPEM HHDQPYGK (SEQ ID NO:
642).

FEATURES OF PROTEIN ENCODED BY GENE NO: 174 

The translation product of this gene shares sequence homology with testis
enhanced gene transcript, which is thought to be important in regulation of human
development.

This gene is expressed primarily in infant brain and to a lesser extent in a variety 
of other tissues and cell types, including the prostate, testes, monocytes, macrophages, 
dendritic cells, keratinocytes, and adipocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neurological,
developmental, immune and inflammation disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the brain and immune systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues and cell types (e.g., brain and other tissue of the nervous system, prostate,
testes and other reproductive tissue, blood cells, keratinocytes, and adipocytes, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to testis enhanced gene transcript indicate
that the protein product of this clone is useful for diagnosis and treatment of disorders
involving the developing brain and the immune system.






FEATURES OF PROTEIN ENCODED BY GENE NO: 175 

This gene is expressed primarily in prostate and to a lesser extent in various
other tissues, including placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancers, especially of the prostate. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the prostate, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., prostate and placenta, and cancerous and wounded tissues) or bodily fluids
(e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell
sample taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein product of this clone is useful
for diagnosis and treatment of prostate disorders and cancer. It may also be useful for
the diagnosis and treatment of endocrine disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 176 

The translation product of this gene shares sequence homology with the
Saccharomyces cerevisiae YNT20 gene, which is thought to be important in
mitochondrial function.

This gene is expressed at a particularly high level in muscle tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases related to such tissues and cell types
including: muscle wasting diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the neuromuscular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., muscle, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to the YNT20 gene indicate that this
protein is useful for treatment and detection of neuromuscular diseases caused by loss
of mitochondrial function. For example, this gene or its protein product could be used in
replacement therapy for such diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 177 

This gene is expressed primarily in the brain and to a lesser extent in kidney,
placenta, smooth muscle, heart and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: neuromuscular
diseases, degenerative diseases of the central nervous system, and heart disease.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
neuromuscular system, central nervous system, and heart, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell
types (e.g., brain and other tissue of the nervous system, kidney, placenta, muscle,
heart and lung, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

This gene or its protein product could also be used for replacement therapy for
the above-mentioned diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 178 

The translation product of this gene shares sequence homology with caldesmon,
which is thought to be important in the cellular response to changes in glucose levels.

This gene is expressed primarily in multiple tissues including brain and retina.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions: central nervous system disorders and
retinopathy. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly CNS
disorders and retinopathy, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., brain and other tissue of
the nervous system, and retinal tissue, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to caldesmon indicate that polynucleotides
and polypeptides corresponding to this gene are useful for treatment of retinopathies.

FEATURES OF PROTEIN ENCODED BY GENE NO: 179 

The translation product of this gene shares sequence homology with mouse
fibrosin protein, which is thought to be important in regulation of fibrinogenesis in
certain chronic inflammatory diseases.

This gene is expressed primarily in amniotic cells and breast tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of breast cancer and abnormal embryo
development. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the reproductive system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., amniotic cells, and
mammary tissue, and cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to fibrosin indicate that the protein product
of this clone is useful for treatment of breast cancer. This gene or its protein product
could be used in replacement therapy for breast cancer. In addition, the protein product
of this gene is useful in the treatment of chronic inflammatory diseases.



FEATURES OF PROTEIN ENCODED BY GENE NO: 180 

This gene is expressed in several infant tissues including brain and liver and
various adult tissues including brain, lung, liver, testes, and prostate.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, brain cancer, lung cancer, liver cancer and cancers of the reproductive
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, hepatic system, and reproductive system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
and cell types (e.g., brain and other tissue of the nervous system, lung, liver, testes and
other reproductive tissue, and prostate, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution of this gene product indicates that the protein product of 
this clone is involved in growth regulation and could be used as a growth factor or 
growth blocker in a variety of settings including treatment of cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 181

This gene is expressed primarily in activated monocytes and to a lesser extent in 
melanocytes and dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of immune system diseases and cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a
number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues and cell types (e.g., blood cells, melanocytes, and dendritic cells, and
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein product of this clone could be
involved in growth regulation and could be used as a growth factor or growth blocker
in a variety of settings including treatment of cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 182

This gene is expressed primarily in placenta and several tumors of various tissue
origin and to a lesser extent in normal tissues including liver, lung, brain, and skin.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of cancers of all kinds. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, respiratory
system and skin, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues and cell types (e.g., liver, lung, brain and other
tissues of the nervous system, and skin, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The high expression of this gene in multiple tumors indicates that the protein
product of the clone may be involved in cell growth control and therefore would be
useful for treatment of certain cancers. Likewise, molecules developed to block the
activity of the protein product of this clone could be used to block its potential role in
tumor growth promotion.

FEATURES OF PROTEIN ENCODED BY GENE NO: 183

The translation product of this gene shares sequence homology with the mouse
Ndr1 gene, which is thought to be important in cancer progression.

This gene is expressed in multiple cell types and tissues including brain, lung,
kidney, bone marrow, liver, and spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of all types of cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the nervous, immune, and endocrine
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues and cell types (e.g., brain and other tissue of the nervous
system, lung, kidney, bone marrow, liver and spleen, and cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to the Ndr1 gene, which is thought to be
involved in cancer progression, indicate that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of certain cancers. Likewise,
molecules developed to block the activity of the protein product of this clone could be
used to block its potential role in tumor growth promotion.

FEATURES OF PROTEIN ENCODED BY GENE NO: 184 

This gene is expressed primarily in early stage human brain and liver and to a 
lesser extent in several other fetal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions: brain and liver cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system and immune system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues and cell types (e.g.,
brain and other tissue of the nervous system, liver, and fetal tissue, and cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The expression of this gene in embryonic tissues indicates that the protein could 
be involved in growth regulation and could be used as a growth factor or growth 
blocker in a variety of settings including treatment of cancers. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 185 

This gene is expressed primarily in infant and embryonic brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of degenerative nervous system disorders and brain
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues and cell types (e.g., embryonic tissue, brain
and other tissue of the nervous system, and cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The expression of this gene in embryonic tissues indicates that the protein could
be involved in growth regulation and could be used as a growth factor or growth
blocker in a variety of settings including treatment of cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 186 

This gene is expressed primarily in multiple tissues including placenta, fetal
lung, fetal liver, and brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of all types of cancers including liver, brain and
lung. Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, pulmonary system, and hepatic system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues and
cell types (e.g., placenta, lung, liver, and brain and other tissue of the nervous system,
and cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The expression of this gene in embryonic tissues indicates that the protein could
be involved in growth regulation and could be used as a growth factor or growth
blocker in a variety of settings including treatment of cancers.

Table 1 summarizes the information corresponding to each "Gene No."
described above. The nucleotide sequence identified as "NT SEQ ID NO:X" was
assembled from partially homologous ("overlapping") sequences obtained from the
"cDNA clone ID" identified in Table 1 and, in some cases, from additional related DNA
clones. The overlapping sequences were assembled into a single contiguous sequence
of high redundancy (usually three to five overlapping sequences at each nucleotide
position), resulting in a final sequence identified as SEQ ID NO:X.
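The redundancy described above (the number of overlapping sequences at each nucleotide position) can be illustrated with a minimal sketch. The function name, the 1-based inclusive coordinate convention, and the toy alignment coordinates are assumptions made for illustration only; they are not taken from Table 1 or from any deposited clone.

```python
from typing import List, Tuple


def coverage_per_position(contig_length: int,
                          reads: List[Tuple[int, int]]) -> List[int]:
    """Count how many overlapping sequences cover each contig position.

    Each read is a (start, end) pair in 1-based, inclusive contig
    coordinates (an assumed convention for this sketch).
    """
    depth = [0] * contig_length
    for start, end in reads:
        if start < 1 or end > contig_length or start > end:
            raise ValueError(f"read ({start}, {end}) lies outside the contig")
        for pos in range(start - 1, end):
            depth[pos] += 1
    return depth


# Hypothetical alignment of four overlapping sequences on a 10-nt contig.
reads = [(1, 6), (3, 10), (5, 10), (1, 4)]
depth = coverage_per_position(10, reads)
print(depth)
```

Positions whose depth falls below the "three to five" range mentioned above would be the ones least supported by independent sequence reads.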

The cDNA Clone ID was deposited on the date and given the corresponding
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits contain
multiple different clones corresponding to the same gene. "Vector" refers to the type of
vector contained in the cDNA Clone ID.

"Total NT Seq." refers to the total number of nucleotides in the contig identified
by "Gene No." The deposited clone may contain all or most of these sequences,
reflected by the nucleotide position indicated as "5' NT of Clone Seq." and the "3' NT
of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID NO:X of the
putative start codon (methionine) is identified as "5' NT of Start Codon." Similarly,
the nucleotide position of SEQ ID NO:X of the predicted signal sequence is identified as
"5' NT of First AA of Signal Pep."

The translated amino acid sequence, beginning with the methionine, is identified 
as "AA SEQ ID NO:Y," although other reading frames can also be easily translated
using known molecular biology techniques. The polypeptides produced by these 
alternative open reading frames are specifically contemplated by the present invention. 

The first and last amino acid positions of SEQ ID NO:Y of the predicted signal
peptide are identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The predicted
first amino acid position of SEQ ID NO:Y of the secreted portion is identified as
"Predicted First AA of Secreted Portion." Finally, the amino acid position of SEQ ID
NO:Y of the last amino acid in the open reading frame is identified as "Last AA of
ORF."

SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the cDNA
contained in the deposited clone. These probes will also hybridize to nucleic acid
molecules in biological samples, thereby enabling a variety of forensic and diagnostic
methods of the invention. Similarly, polypeptides identified from SEQ ID NO:Y may
be used to generate antibodies which bind specifically to the secreted proteins encoded
by the cDNA clones identified in Table 1.

Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion
in an open reading frame of over 1000 bases).
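
The effect of a single inserted base can be illustrated with a short Python sketch. The
sequence below is hypothetical and is not taken from Table 1; the names `translate`,
`actual_dna`, and `generated_dna` are assumptions made for illustration only. The
sketch inserts one base into an open reading frame of just over 1000 bases and
compares the two translations.

```python
# Standard genetic code laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna):
    """Translate DNA from its first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical "actual" open reading frame of 1008 bases (not from Table 1).
actual_dna = "ATG" + "GCTGAACTGAAATCT" * 67
# Hypothetical sequencing error: one extra G reported after the tenth base.
generated_dna = actual_dna[:10] + "G" + actual_dna[10:]

# One erroneous base relative to the length of the actual sequence.
nucleotide_identity = 100.0 * (len(actual_dna) - 1) / len(actual_dna)

actual_protein = translate(actual_dna)
predicted_protein = translate(generated_dna)

# Count the leading residues that still agree before the frame shift takes over.
shared_prefix = 0
for actual_residue, predicted_residue in zip(actual_protein, predicted_protein):
    if actual_residue != predicted_residue:
        break
    shared_prefix += 1

print(f"nucleotide identity: {nucleotide_identity:.2f}%")
print(f"residues in agreement before divergence: {shared_prefix}")
```

In this sketch the nucleotide sequences differ by a single base, yet the predicted
protein agrees with the actual protein only for the codons upstream of the insertion.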

Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing a human cDNA of the invention deposited with the ATCC, as set forth in
Table 1. The nucleotide sequence of each deposited clone can readily be determined by
sequencing the deposited clone in accordance with known methods. The predicted
amino acid sequence can then be verified from such deposits. Moreover, the amino
acid sequence of the protein encoded by a particular clone can also be directly
determined by peptide sequencing or by expressing the protein in a suitable host cell
containing the deposited human cDNA, collecting the protein, and determining its
sequence.

The present invention also relates to the genes corresponding to SEQ ID NO:X, 
SEQ ID NO:Y, or the deposited clone. The corresponding gene can be isolated in 
accordance with known methods using the sequence information disclosed herein. 
Such methods include preparing probes or primers from the disclosed sequence and 
identifying or amplifying the corresponding gene from appropriate sources of genomic
material. 

Also provided in the present invention are species homologs. Species 
homologs may be isolated and identified by making suitable probes or primers from the 
sequences provided herein and screening a suitable nucleic acid source for the desired 
homolog.

The polypeptides of the invention can be prepared in any suitable manner. Such 
polypeptides include isolated naturally occurring polypeptides, recombinantly produced 
polypeptides, synthetically produced polypeptides, or polypeptides produced by a 
combination of these methods. Means for preparing such polypeptides are well 
understood in the art.

The polypeptides may be in the form of the secreted protein, including the 
mature form, or may be a part of a larger protein, such as a fusion protein (see below). 
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an isolated
form, and preferably are substantially purified. A recombinantly produced version of a
polypeptide, including the secreted polypeptide, can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant sources
using antibodies of the invention raised against the secreted protein in methods which
are well known in the art.

Signal Sequences 

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of McGeoch,
Virus Res. 3:271-286 (1985), uses the information from a short N-terminal charged
region and a subsequent uncharged region of the complete (uncleaved) protein. The
method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986), uses the information
from the residues surrounding the cleavage site, typically residues -13 to +2, where +1
indicates the amino terminus of the secreted protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods is in
the range of 75-80%. (von Heijne, supra.) However, the two methods do not always
produce the same predicted cleavage point(s) for a given protein.

In the present case, the deduced amino acid sequence of the secreted polypeptide
was analyzed by a computer program called SignalP (Henrik Nielsen et al., Protein
Engineering 10:1-6 (1997)), which predicts the cellular location of a protein based on
the amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the amino acid
sequences of the secreted proteins described herein by this program provided the results
shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y which have an N-terminus beginning within 5 residues (i.e., +
or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that in
some cases, cleavage of the signal sequence from a secreted protein is not entirely
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present invention.
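
The range of N-termini described above can be enumerated with a short Python sketch.
The sequence and the predicted position used here are hypothetical, and the function
name `candidate_n_termini` is an assumption made for illustration; positions are
1-based, matching the way Table 1 reports amino acid positions.

```python
def candidate_n_termini(sequence, predicted_first_aa, window=5):
    """Map each allowed first-residue position to the resulting secreted form.

    `predicted_first_aa` is the 1-based position of the predicted first amino
    acid of the secreted portion; `window` is the tolerance in residues.
    """
    candidates = {}
    first = predicted_first_aa - window
    last = predicted_first_aa + window
    for start in range(first, last + 1):
        # Skip positions that fall outside the translated sequence.
        if 1 <= start <= len(sequence):
            candidates[start] = sequence[start - 1:]
    return candidates


# Hypothetical 33-residue translation with a predicted cleavage before residue 21.
example_protein = "MKTLLLAVLLLSAPLAQASGQEVKPTDSRWLNH"
forms = candidate_n_termini(example_protein, predicted_first_aa=21)

for position, secreted_form in forms.items():
    print(position, secreted_form)
```

With a tolerance of five residues on either side, up to eleven candidate secreted
forms are produced for each predicted cleavage point.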

Moreover, the signal sequence identified by the above analysis may not 
necessarily predict the naturally occurring signal sequence. For example, the naturally 
occurring signal sequence may be further upstream from the predicted signal sequence.
However, it is likely that the predicted signal sequence will be capable of directing the 
secreted protein to the ER. These polypeptides, and the polynucleotides encoding such 
polypeptides, are contemplated by the present invention. 

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential properties 
thereof. Generally, variants are overall closely similar, and, in many regions, identical 
to the polynucleotide or polypeptide of the present invention. 

"Identity" per se has an art-recognized meaning and can be calculated using
published techniques. (See, e.g., COMPUTATIONAL MOLECULAR BIOLOGY,
Lesk, A.M., ed., Oxford University Press, New York (1988); BIOCOMPUTING:
INFORMATICS AND GENOME PROJECTS, Smith, D.W., ed., Academic Press,
New York (1993); COMPUTER ANALYSIS OF SEQUENCE DATA, PART I,
Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey (1994);
SEQUENCE ANALYSIS IN MOLECULAR BIOLOGY, von Heijne, G., Academic
Press (1987); and SEQUENCE ANALYSIS PRIMER, Gribskov, M. and Devereux,
J., eds., M Stockton Press, New York (1991).) While there exist a number of
methods to measure identity between two polynucleotide or polypeptide sequences, the
term "identity" is well known to skilled artisans. (Carillo, H., and Lipton, D., SIAM J
Applied Math 48:1073 (1988).) Methods commonly employed to determine identity or
similarity between two sequences include, but are not limited to, those disclosed in
"Guide to Huge Computers," Martin J. Bishop, ed., Academic Press, San Diego
(1994), and Carillo, H., and Lipton, D., SIAM J Applied Math 48:1073 (1988).

Methods for aligning polynucleotides or polypeptides are codified in computer
programs, including the GCG program package (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul, S.F. et
al., J. Molec. Biol. 215:403 (1990)), and the Bestfit program (Wisconsin Sequence
Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research
Park, 575 Science Drive, Madison, WI 53711), which uses the local homology algorithm
of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981).

When using any of the sequence alignment programs to determine whether a
particular sequence is, for instance, 95% identical to a reference sequence, the
parameters are set so that the percentage of identity is calculated over the full length of
the reference polynucleotide and that gaps in identity of up to 5% of the total number of
nucleotides in the reference polynucleotide are allowed.

A preferred method for determining the best overall match between a query
sequence (a sequence of the present invention) and a subject sequence, also referred to
as a global sequence alignment, uses the FASTDB computer
program based on the algorithm of Brutlag et al. (Comp. App. Biosci. 6:237-245
(1990).) The term "sequence" includes nucleotide and amino acid sequences. In a
sequence alignment the query and subject sequences are either both nucleotide
sequences or both amino acid sequences. The result of said global sequence alignment
is in percent identity. Preferred parameters used in a FASTDB search of a DNA
sequence to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch
Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1,
Gap Penalty=5, Gap Size Penalty=0.05, and Window Size=500 or query sequence
length in nucleotide bases, whichever is shorter. Preferred parameters employed to
calculate percent identity and similarity of an amino acid alignment are: Matrix=PAM
150, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group
Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, and Window
Size=500 or query sequence length in amino acid residues, whichever is shorter.

As an illustration, a polynucleotide having a nucleotide sequence of at least 95%
"identity" to a sequence contained in SEQ ID NO:X or the cDNA contained in the
deposited clone, means that the polynucleotide is identical to a sequence contained in
SEQ ID NO:X or the cDNA except that the polynucleotide sequence may include up to
five point mutations per each 100 nucleotides of the total length (not just within a given
100 nucleotide stretch). In other words, to obtain a polynucleotide having a nucleotide
sequence at least 95% identical to SEQ ID NO:X or the deposited clone, up to 5% of the
nucleotides in the sequence contained in SEQ ID NO:X or the cDNA can be deleted,
inserted, or substituted with other nucleotides. These changes may occur anywhere
throughout the polynucleotide.
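
The arithmetic behind this definition can be sketched in Python. The sketch assumes
two sequences that have already been aligned to equal length, with "-" marking a gap,
and counts every substituted, deleted, or inserted position as one difference relative
to the full length of the reference. This is one simple convention chosen for
illustration; it is not the calculation performed by FASTDB or any other named
program, and the function names are hypothetical.

```python
def max_allowed_differences(reference_length, min_identity_percent):
    """Largest number of point changes compatible with the identity threshold."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return reference_length * (100 - min_identity_percent) // 100


def percent_identity(aligned_reference, aligned_query):
    """Percent identity over the full length of the (ungapped) reference."""
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for base in aligned_reference if base != "-")
    # A column differs when it holds a substitution, a deletion, or an insertion.
    differences = sum(
        1 for ref, qry in zip(aligned_reference, aligned_query) if ref != qry
    )
    return 100.0 * (reference_length - differences) / reference_length


# Hypothetical 100-nucleotide reference with five substitutions in the query.
reference = "ACGT" * 25
query = list(reference)
for position in (0, 20, 40, 60, 80):
    query[position] = "T"
query = "".join(query)

print(max_allowed_differences(len(reference), 95))
print(percent_identity(reference, query))
```

Because the threshold is applied to the total length, the allowed changes may be
clustered or scattered; only their total count matters in this sketch.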

Further embodiments of the present invention include polynucleotides having at
least 85% identity, more preferably at least 90% identity, and most preferably at least
95%, 96%, 97%, 98% or 99% identity to a sequence contained in SEQ ID NO:X or the
cDNA contained in the deposited clone. Of course, due to the degeneracy of the genetic
code, one of ordinary skill in the art will immediately recognize that a large number of
the polynucleotides having at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity
will encode a polypeptide identical to an amino acid sequence contained in SEQ ID
NO:Y or the expressed protein produced by the deposited clone.

Similarly, a polypeptide having an amino acid sequence with at least, for
example, 95% "identity" to a reference polypeptide means that the amino acid
sequence of the polypeptide is identical to the reference polypeptide except that the
polypeptide sequence may include up to five amino acid alterations per each 100 amino
acids of the total length of the reference polypeptide. In other words, to obtain a
polypeptide having an amino acid sequence at least 95% identical to a reference amino
acid sequence, up to 5% of the amino acid residues in the reference sequence may be
deleted or substituted with another amino acid, or a number of amino acids up to 5% of
the total amino acid residues in the reference sequence may be inserted into the reference
sequence. These alterations of the reference sequence may occur at the amino or
carboxy terminal positions of the reference amino acid sequence or anywhere between
those terminal positions, interspersed either individually among residues in the
reference sequence or in one or more contiguous groups within the reference sequence.

Further embodiments of the present invention include polypeptides having at 
least 80% identity, more preferably at least 85% identity, more preferably at least 90% 
identity, and most preferably at least 95%, 96%, 97%, 98% or 99% identity to an 
amino acid sequence contained in SEQ ID NO:Y or the expressed protein produced by
the deposited clone. Preferably, the above polypeptides should exhibit at least one
biological activity of the protein. 

In a preferred embodiment, polypeptides of the present invention include 
polypeptides having at least 90% similarity, more preferably at least 95% similarity, and 
still more preferably at least 96%, 97%, 98%, or 99% similarity to an amino acid
sequence contained in SEQ ID NO:Y or the expressed protein produced by the
deposited clone. 

The variants may contain alterations in the coding regions, non-coding regions,
or both. Especially preferred are polynucleotide variants containing alterations which
produce silent substitutions, additions, or deletions, but do not alter the properties or
activities of the encoded polypeptide. Nucleotide variants produced by silent
substitutions due to the degeneracy of the genetic code are preferred. Moreover,
variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any
combination are also preferred. Polynucleotide variants can be produced for a variety
of reasons, e.g., to optimize codon expression for a particular host (change codons in
the human mRNA to those preferred by a bacterial host such as E. coli).
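
A silent recoding of this kind can be sketched in Python as follows. The preferred
codons listed in `ILLUSTRATIVE_PREFERRED_CODONS` are placeholders chosen for the
example and are not a measured codon usage table for any host; the input sequence is
likewise hypothetical. The sketch swaps codons only when the replacement encodes the
same amino acid under the standard genetic code.

```python
# Standard genetic code laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# Placeholder preferences for illustration only (not real usage data).
ILLUSTRATIVE_PREFERRED_CODONS = {"L": "CTG", "R": "CGT", "G": "GGT"}


def translate(dna):
    """Translate DNA from its first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recode_silently(dna, preferred_codons):
    """Replace codons with preferred synonyms, leaving the protein unchanged."""
    usable = len(dna) - len(dna) % 3
    recoded = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        replacement = preferred_codons.get(amino_acid, codon)
        # Refuse any replacement that would change the encoded residue.
        if CODON_TABLE[replacement] != amino_acid:
            raise ValueError(f"{replacement} does not encode {amino_acid}")
        recoded.append(replacement)
    return "".join(recoded)


original_dna = "ATGCTAAGAGGACTTTAA"  # hypothetical coding sequence
variant_dna = recode_silently(original_dna, ILLUSTRATIVE_PREFERRED_CODONS)

print(original_dna, translate(original_dna))
print(variant_dna, translate(variant_dna))
```

The two polynucleotides differ at several positions but encode the same polypeptide,
which is the sense in which such substitutions are silent.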

Naturally occurring variants are called "allelic variants," and refer to one of
several alternate forms of a gene occupying a given locus on a chromosome of an
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These
allelic variants can vary at the polynucleotide level, the polypeptide level, or both.
Alternatively, non-naturally occurring variants may be produced by mutagenesis
techniques or by direct synthesis.

Using known methods of protein engineering and recombinant DNA
technology, variants may be generated to improve or alter the characteristics of the
polypeptides of the present invention. For instance, one or more amino acids can be
deleted from the N-terminus or C-terminus of the secreted protein without substantial
loss of biological function. The authors of Ron et al., J. Biol. Chem. 268:2984-2988
(1993), reported variant KGF proteins having heparin binding activity even after
deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the
carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216 (1988).)

Moreover, ample evidence demonstrates that variants often retain a biological
activity similar to that of the naturally occurring protein. For example, Gayle and
coworkers (J. Biol. Chem. 268:22105-22111 (1993)) conducted extensive mutational
analysis of human cytokine IL-1a. They used random mutagenesis to generate over
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over
the entire length of the molecule. Multiple mutations were examined at every possible
amino acid position. The investigators found that "[m]ost of the molecule could be
altered with little effect on either [binding or biological activity]." (See, Abstract.) In
fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide
sequences examined, produced a protein that significantly differed in activity from
wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or
C-terminus of a polypeptide results in modification or loss of one or more biological
functions, other biological activities may still be retained. For example, the ability of a
deletion variant to induce and/or to bind antibodies which recognize the secreted form
will likely be retained when less than the majority of the residues of the secreted form
are removed from the N-terminus or C-terminus. Whether a particular polypeptide
lacking N- or C-terminal residues of a protein retains such immunogenic activities can
readily be determined by routine methods described herein and otherwise known in the
art.

Thus, the invention further includes polypeptide variants which show
substantial biological activity. Such variants include deletions, insertions, inversions,
repeats, and substitutions selected according to general rules known in the art so as
to have little effect on activity. For example, guidance concerning how to make
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al.,
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main
strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural
selection during the process of evolution. By comparing amino acid sequences in
different species, conserved amino acids can be identified. These conserved amino
acids are likely important for protein function. In contrast, the amino acid positions
where substitutions have been tolerated by natural selection are likely not
critical for protein function. Thus, positions tolerating amino acid
substitution could be modified while still maintaining biological activity of the protein.

The second strategy uses genetic engineering to introduce amino acid changes at 
specific positions of a cloned gene to identify regions critical for protein function. For 
example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of 
single alanine mutations at every residue in the molecule) can be used. (Cunningham
and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then
be tested for biological activity. 
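
The set of single-alanine mutants used in such a scan can be generated with a few lines
of Python. The function name `alanine_scan` and the example sequence are hypothetical.
The sketch skips residues that are already alanine, since replacing them would
reproduce the starting sequence; that choice is an assumption of this sketch rather
than a statement about the cited method.

```python
def alanine_scan(protein):
    """Yield (position, original_residue, mutant) for each single-alanine mutant.

    Positions are 1-based. Residues that are already alanine are skipped.
    """
    for index, residue in enumerate(protein):
        if residue == "A":
            continue
        mutant = protein[:index] + "A" + protein[index + 1:]
        yield index + 1, residue, mutant


# Hypothetical four-residue peptide.
mutants = list(alanine_scan("MKAL"))
for position, residue, mutant in mutants:
    print(f"{residue}{position}A", mutant)
```

Each mutant would then be expressed and assayed separately, so the number of
constructs grows linearly with the length of the protein.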

As the authors state, these two strategies have revealed that proteins are
surprisingly tolerant of amino acid substitutions. The authors further indicate which
amino acid changes are likely to be permissive at certain amino acid positions in the
protein. For example, most buried (within the tertiary structure of the protein) amino
acid residues require nonpolar side chains, whereas few features of surface side chains
are generally conserved. Moreover, tolerated conservative amino acid substitutions
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile;
replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues
Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic
residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp;
and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
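
The groupings listed above can be written as a small lookup in Python. The group
labels and function names are assumptions made for illustration; the membership of
each group follows the text. A substitution is treated as conservative here when both
residues fall in at least one common group.

```python
# Groups of interchangeable residues, in one-letter code, as listed in the text.
CONSERVATIVE_GROUPS = {
    "aliphatic": {"A", "V", "L", "I"},
    "hydroxyl": {"S", "T"},
    "acidic": {"D", "E"},
    "amide": {"N", "Q"},
    "basic": {"K", "R", "H"},
    "aromatic": {"F", "Y", "W"},
    "small": {"A", "S", "T", "M", "G"},
}


def shared_groups(residue_a, residue_b):
    """Return the sorted names of every group containing both residues."""
    return sorted(
        name
        for name, members in CONSERVATIVE_GROUPS.items()
        if residue_a in members and residue_b in members
    )


def is_conservative(residue_a, residue_b):
    """True when two different residues share at least one group."""
    return residue_a != residue_b and bool(shared_groups(residue_a, residue_b))


print(is_conservative("L", "I"))
print(is_conservative("D", "K"))
print(shared_groups("S", "T"))
```

Because Ala, Ser, and Thr each appear in two groups, a pair such as Ser and Thr is
reported under both the hydroxyl and the small-sized groupings.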

Besides conservative amino acid substitution, variants of the present invention
include (i) substitutions with one or more of the non-conserved amino acid residues,
where the substituted amino acid residues may or may not be one encoded by the
genetic code, or (ii) substitution with one or more of amino acid residues having a
substituent group, or (iii) fusion of the mature polypeptide with another compound,
such as a compound to increase the stability and/or solubility of the polypeptide (for
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a
sequence facilitating purification. Such variant polypeptides are deemed to be within
the scope of those skilled in the art from the teachings herein.

For example, polypeptide variants containing amino acid substitutions of
charged amino acids with other charged or neutral amino acids may produce proteins
with improved characteristics, such as less aggregation. Aggregation of pharmaceutical
formulations both reduces activity and increases clearance due to the aggregate's
immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967);
Robbins et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev. Therapeutic
Drug Carrier Systems 10:307-377 (1993).)

Polynucleotide and Polypeptide Fragments 

In the present invention, a "polynucleotide fragment" refers to a short
polynucleotide having a nucleic acid sequence contained in the deposited clone or
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt,
and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in
length," for example, is intended to include 20 or more contiguous bases from the
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and primers
as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000
nucleotides) are preferred.

Moreover, representative examples of polynucleotide fragments of the
invention include, for example, fragments having a sequence from about nucleotide
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400,
401-450, 451-500, 501-550, 551-600, 651-700, and 701 to the end of SEQ ID NO:X or the
cDNA contained in the deposited clone. In this context "about" includes the particularly
recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either
terminus or at both termini. Preferably, these fragments encode a polypeptide which
has biological activity.

In the present invention, a "polypeptide fragment" refers to a short amino acid
sequence contained in SEQ ID NO:Y or encoded by the cDNA contained in the
deposited clone. Protein fragments may be "free-standing," or comprised within a
larger polypeptide of which the fragment forms a part or region, most preferably as a
single continuous region. Representative examples of polypeptide fragments of the
invention include, for example, fragments from about amino acid number 1-20, 21-40,
41-60, 61-80, 81-100, 102-120, 121-140, 141-160, and 161 to the end of the coding
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90,
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about"
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1)
amino acids, at either extreme or at both extremes.

Preferred polypeptide fragments include the secreted protein as well as the
mature form. Further preferred polypeptide fragments include the secreted protein or
the mature form having a continuous series of deleted residues from the amino or the
carboxy terminus, or both. For example, any number of amino acids, ranging from
1-60, can be deleted from the amino terminus of either the secreted polypeptide or the
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted
from the carboxy terminus of the secreted protein or mature form. Furthermore, any
combination of the above amino and carboxy terminus deletions is preferred.
Similarly, polynucleotide fragments encoding these polypeptide fragments are also
preferred.
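
The terminal deletions described above amount to simple slicing of the amino acid
sequence, as the following Python sketch shows. The function name
`terminal_deletion`, its parameter names, and the example sequence are hypothetical;
the limits of 60 amino-terminal and 30 carboxy-terminal residues are taken from the
text.

```python
def terminal_deletion(protein, n_deleted=0, c_deleted=0, max_n=60, max_c=30):
    """Return the fragment left after deleting residues from each terminus."""
    if not 0 <= n_deleted <= max_n:
        raise ValueError("amino-terminal deletion outside the stated range")
    if not 0 <= c_deleted <= max_c:
        raise ValueError("carboxy-terminal deletion outside the stated range")
    if n_deleted + c_deleted >= len(protein):
        raise ValueError("deletions would remove the entire sequence")
    return protein[n_deleted:len(protein) - c_deleted]


# Hypothetical 20-residue mature form.
mature_form = "ACDEFGHIKLMNPQRSTVWY"
fragment = terminal_deletion(mature_form, n_deleted=2, c_deleted=3)
print(fragment)
```

Any pair of deletion counts within the two ranges defines one fragment, provided at
least one residue of the starting sequence remains.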

Also preferred are polypeptide and polynucleotide fragments characterized by
structural or functional domains, such as fragments that comprise alpha-helix and
alpha-helix forming regions, beta-sheet and beta-sheet-forming regions, turn and
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic index regions.
Polypeptide fragments of SEQ ID NO:Y falling within conserved domains are
specifically contemplated by the present invention. Moreover, polynucleotide
fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active fragments. Biologically active 
fragments are those exhibiting activity similar, but not necessarily identical, to an 
activity of the polypeptide of the present invention. The biological activity of the
fragments may include an improved desired activity, or a decreased undesirable activity.

Epitopes & Antibodies 

In the present invention, "epitopes" refer to polypeptide fragments having
antigenic or immunogenic activity in an animal, especially in a human. A preferred
embodiment of the present invention relates to a polypeptide fragment comprising an
epitope, as well as the polynucleotide encoding this fragment. A region of a protein
molecule to which an antibody can bind is defined as an "antigenic epitope." In
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA
81:3998-4002 (1983).)

Fragments which function as epitopes may be produced by any conventional 
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 
(1985), further described in U.S. Patent No. 4,631,211.)

In the present invention, antigenic epitopes preferably contain a sequence of at 
least seven, more preferably at least nine, and most preferably between about 15 to 
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including 
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et 
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).)

Similarly, immunogenic epitopes can be used to induce antibodies according to
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al.,
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914 (1985); and Bittle, F. J. et
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes
the secreted protein. The immunogenic epitopes may be presented together with a
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if
it is long enough (at least about 25 amino acids), without a carrier. However,
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a
denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is
meant to include intact molecules as well as antibody fragments (such as, for example,
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from
the circulation, and may have less non-specific tissue binding than an intact antibody.
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred,
as well as the products of a FAB or other immunoglobulin expression library.
Moreover, antibodies of the present invention include chimeric, single chain, and
humanized antibodies.



Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion
proteins. For example, the polypeptide of the present invention, when fused to a
second protein, can be used as an antigenic tag. Antibodies raised against the
polypeptide of the present invention can be used to indirectly detect the second protein
by binding to the polypeptide. Moreover, because secreted proteins target cellular
locations based on trafficking signals, the polypeptides of the present invention can be
used as targeting molecules once fused to other proteins.

Examples of domains that can be fused to polypeptides of the present invention
include not only heterologous signal sequences, but also other heterologous functional 
regions. The fusion does not necessarily need to be direct, but may occur through 
linker sequences. 

Moreover, fusion proteins may also be engineered to improve characteristics of
the polypeptide of the present invention. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence during purification from the host cell or
subsequent handling and storage. Also, peptide moieties may be added to the
polypeptide to facilitate purification. Such regions may be removed prior to final
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of
polypeptides is a familiar and routine technique in the art.

Moreover, polypeptides of the present invention, including fragments, and
specifically epitopes, can be combined with parts of the constant domain of
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins
facilitate purification and show an increased half-life in vivo. One reported example
describes chimeric proteins consisting of the first two domains of the human
CD4-polypeptide and various domains of the constant regions of the heavy or light chains of
mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the IgG)
can also be more efficient in binding and neutralizing other molecules than the
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J.
Biochem. 270:3958-3964 (1995).)

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion
proteins comprising various portions of constant region of immunoglobulin molecules
together with another human protein or part thereof. In many cases, the Fc part in a
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively,
deleting the Fc part after the fusion protein has been expressed, detected, and purified
would be desired. For example, the Fc portion may hinder therapy and diagnosis if the
fusion protein is used as an antigen for immunizations. In drug discovery, for
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D.
Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol.
Chem. 270:9459-9471 (1995).)

Moreover, the polypeptides of the present invention can be fused to marker
sequences, such as a peptide which facilitates purification of the fused polypeptide. In
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide,
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311), among others, many of which are commercially available.
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for
instance, hexa-histidine provides for convenient purification of the fusion protein.
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).)

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the claimed invention. 

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide of the
present invention, host cells, and the production of polypeptides by recombinant
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral
vector. Retroviral vectors may be replication competent or replication defective. In the
latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is
a virus, it may be packaged in vitro using an appropriate packaging cell line and then
transduced into host cells.

The polynucleotide insert should be operatively linked to an appropriate
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to
name a few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation, termination,
and, in the transcribed region, a ribosome binding site for translation. The coding
portion of the transcripts expressed by the constructs will preferably include a
translation initiating codon at the beginning and a termination codon (UAA, UGA or
UAG) appropriately positioned at the end of the polypeptide to be translated.
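
The requirement on the coding portion can be checked mechanically before a construct is built. The following is a minimal sketch, not part of the disclosed methods: the function name check_coding_region is hypothetical, and AUG is assumed as the initiation codon because the text does not name one.

```python
# Termination codons named in the text (RNA alphabet).
STOP_CODONS = {"UAA", "UGA", "UAG"}
# Assumption: the standard initiation codon; the text does not name one.
START_CODON = "AUG"


def check_coding_region(seq: str) -> list[str]:
    """Return a list of problems found in a coding region (empty if none)."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA spelling
    problems = []
    if len(rna) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons only; a trailing partial codon is ignored.
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] != START_CODON:
        problems.append("no initiation codon at the beginning")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no termination codon at the end")
    if any(c in STOP_CODONS for c in codons[:-1]):
        problems.append("premature termination codon inside the reading frame")
    return problems
```

An empty list means the insert begins with the assumed initiation codon, ends with one of the three termination codons, and has no in-frame stop before the end.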

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance
genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS,
293, and Bowes melanoma cells; and plant cells. Appropriate culture media and
conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9,
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A,
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech,
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection, or other methods. Such methods
are described in many standard laboratory manuals, such as Davis et al., Basic Methods
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the
present invention may in fact be expressed by a host cell lacking a recombinant vector.

A polypeptide of this invention can be recovered and purified from recombinant
cell cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for
purification.

Polypeptides of the present invention, and preferably the secreted form, can also
be recovered from: products purified from natural sources, including bodily fluids,
tissues and cells, whether directly isolated or cultured; products of chemical synthetic
procedures; and products produced by recombinant techniques from a prokaryotic or
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and
mammalian cells. Depending upon the host employed in a recombinant production
procedure, the polypeptides of the present invention may be glycosylated or may be
non-glycosylated. In addition, polypeptides of the invention may also include an initial
modified methionine residue, in some cases as a result of host-mediated processes.
Thus, it is well known in the art that the N-terminal methionine encoded by the
translation initiation codon generally is removed with high efficiency from any protein
after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this
prokaryotic removal process is inefficient, depending on the nature of the amino acid to
which the N-terminal methionine is covalently linked.

Uses of the Polynucleotides

Each of the polynucleotides identified herein can be used in numerous ways as
reagents. The following description should be considered exemplary and utilizes
known techniques.

The polynucleotides of the present invention are useful for chromosome
identification. There exists an ongoing need to identify new chromosome markers,
since few chromosome marking reagents, based on actual sequence data (repeat
polymorphisms), are presently available. Each polynucleotide of the present invention
can be used as a chromosome marker.

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be
selected using computer analysis so that primers do not span more than one predicted
exon in the genomic DNA. These primers are then used for PCR screening of somatic
cell hybrids containing individual human chromosomes. Only those hybrids containing
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment.
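
The computer analysis mentioned above can be illustrated with a short sketch. It is only one possible approach under stated assumptions: the function names are hypothetical, exon boundaries are assumed to be supplied as half-open (start, end) coordinates on the cDNA, and no melting-temperature or specificity filtering is attempted.

```python
def within_one_exon(start, length, exons):
    """True if the window [start, start + length) lies inside a single exon."""
    end = start + length
    return any(ex_start <= start and end <= ex_end for ex_start, ex_end in exons)


def candidate_primers(cdna, exons, length=20):
    """List (start, primer) pairs of the given length that do not span exons.

    exons holds hypothetical predicted exon boundaries as half-open
    (start, end) intervals on the cDNA sequence.
    """
    if not 15 <= length <= 25:  # preferred primer size range from the text
        raise ValueError("primer length should be 15-25 bp")
    return [
        (start, cdna[start:start + length])
        for start in range(len(cdna) - length + 1)
        if within_one_exon(start, length, exons)
    ]
```

Any window that crosses a predicted exon junction is discarded, because such a primer would not anneal to a contiguous stretch of genomic DNA.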

Similarly, somatic hybrids provide a rapid method of PCR mapping the
polynucleotides to particular chromosomes. Three or more clones can be assigned per
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can
be achieved with panels of specific chromosome fragments. Other gene mapping
strategies that can be used include in situ hybridization, prescreening with labeled
flow-sorted chromosomes, and preselection by hybridization to construct
chromosome-specific cDNA libraries.

Precise chromosomal location of the polynucleotides can also be achieved using
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al.,
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York
(1988).

For chromosome mapping, the polynucleotides can be used individually (to
mark a single chromosome or a single site on that chromosome) or in panels (for
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides
correspond to the noncoding regions of the cDNAs because the coding sequences are
more likely conserved within gene families, thus increasing the chance of
cross-hybridization during chromosomal mapping.

Once a polynucleotide has been mapped to a precise chromosomal location, the
physical position of the polynucleotide can be used in linkage analysis. Linkage
analysis establishes coinheritance between a chromosomal location and presentation of a
particular disease. (Disease mapping data are found, for example, in V. McKusick,
Mendelian Inheritance in Man (available on line through Johns Hopkins University
Welch Medical Library).) Assuming 1 megabase mapping resolution and one gene per
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease
could be one of 50-500 potential causative genes.
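
The arithmetic behind this estimate can be sketched as follows. The function name is hypothetical. The lower figure follows directly from the stated assumptions (1 megabase at one gene per 20 kb); the 10 megabase region used for the upper figure is an inference from the stated range, not a value given in the text.

```python
def candidate_gene_count(region_bp, bp_per_gene=20_000):
    """Estimate how many genes fall in a region, at one gene per bp_per_gene."""
    return region_bp // bp_per_gene


# 1 megabase resolution at one gene per 20 kb gives the lower bound.
lower = candidate_gene_count(1_000_000)
# Assumption: a 10 megabase region, which reproduces the upper bound.
upper = candidate_gene_count(10_000_000)
print(lower, upper)  # prints "50 500"
```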

Thus, once coinheritance is established, differences in the polynucleotide and
the corresponding gene between affected and unaffected individuals can be examined.
First, visible structural alterations in the chromosomes, such as deletions or
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed in
some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide and
the corresponding gene from several normal individuals is required to distinguish the
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic
polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected
individuals as compared to unaffected individuals can be assessed using
polynucleotides of the present invention. Any of these alterations (altered expression,
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic
marker.

In addition to the foregoing, a polynucleotide can be used to control gene
expression through triple helix formation or antisense DNA or RNA. Both methods
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred
polynucleotides are usually 20 to 40 bases in length and complementary to either the
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988).) Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks translation
of an mRNA molecule into polypeptide. Both techniques are effective in model
systems, and the information disclosed herein can be used to design antisense or triple
helix polynucleotides in an effort to treat disease.
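
As an illustration of the antisense case only, an oligonucleotide complementary to a chosen stretch of mRNA is the reverse complement of that stretch. The sketch below uses a hypothetical function name and a made-up example sequence; it does not address target-site selection, secondary structure, or chemistry, and it does not cover triple helix design.

```python
# Watson-Crick pairing in the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna, start, length=20):
    """Return the antisense RNA (5'->3') for mrna[start:start + length]."""
    if not 20 <= length <= 40:  # preferred size range from the text
        raise ValueError("length should be 20 to 40 bases")
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Made-up sequence, for illustration only.
example_mrna = "AUGGCCAAGCUUGGAUCCGAAUUCAUG"
print(antisense_rna(example_mrna, 0, 20))  # prints "UCGGAUCCAAGCUUGGCCAU"
```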

Polynucleotides of the present invention are also useful in gene therapy. One
goal of gene therapy is to insert a normal gene into an organism having a defective
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the
present invention offer a means of targeting such genetic defects in a highly accurate
manner. Another goal is to insert a new gene that was not present in the host genome,
thereby producing a new trait in the host cell.

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel. In
this technique, an individual's genomic DNA is digested with one or more restriction
enzymes, and probed on a Southern blot to yield unique bands for identifying
personnel. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
polynucleotides of the present invention can be used as additional DNA markers for
RFLP.
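
The principle behind RFLP can be sketched in silico: a sequence difference that creates or destroys a restriction site changes the set of fragment lengths. The example below is a simplified illustration with hypothetical names and made-up alleles. It uses the EcoRI recognition site GAATTC, which is cut after the first base, treats the DNA as a single linear strand, and reports every fragment rather than only those a Southern blot probe would detect.

```python
def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Fragment lengths from cutting linear DNA at every occurrence of site.

    cut_offset is the cut position within the site (1 for EcoRI, G^AATTC).
    """
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Made-up alleles: allele_b carries a point change that destroys the second site.
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
print(fragment_lengths(allele_a))  # prints "[5, 12, 7]"
print(fragment_lengths(allele_b))  # prints "[5, 19]"
```

The two alleles differ by one base, yet give different fragment patterns, which is the kind of difference that appears as distinct bands.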

The polynucleotides of the present invention can also be used as an alternative to
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an
individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, individuals can be identified because each individual will have a unique set
of DNA sequences. Once a unique ID database is established for an individual,
positive identification of that individual, living or dead, can be made from extremely
small tissue samples.

Forensic biology also benefits from using DNA-based identification techniques
as disclosed herein. DNA sequences taken from very small biological samples such as
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be
amplified using PCR. In one prior art technique, gene sequences amplified from
polymorphic loci, such as the DQa class II HLA gene, are used in forensic biology to
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once
these specific polymorphic loci are amplified, they are digested with one or more
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the
present invention can be used as polymorphic markers for forensic purposes.

There is also a need for reagents capable of identifying the source of a particular
tissue. Such need arises, for example, in forensics when presented with tissue of
unknown origin. Appropriate reagents can comprise, for example, DNA probes or
primers specific to particular tissue prepared from the sequences of the present
invention. Panels of such reagents can identify tissue by species and/or by organ type.
In a similar fashion, these reagents can be used to screen tissue cultures for
contamination.

At the very least, the polynucleotides of the present invention can be used as
molecular weight markers on Southern gels, as diagnostic probes for the presence of a
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences
in the process of discovering novel polynucleotides, for selecting and making oligomers
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using
DNA immunization techniques, and as an antigen to elicit an immune response.

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The
following description should be considered exemplary and utilizes known techniques.

A polypeptide of the present invention can be used to assay protein levels in a
biological sample using antibody-based techniques. For example, protein expression in
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol. 105:3087-3096
(1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known
in the art and include enzyme labels, such as glucose oxidase; radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc); fluorescent labels, such as fluorescein and rhodamine; and
biotin.

In addition to assaying secreted protein levels in a biological sample, proteins
can also be detected in vivo by imaging. Antibody labels or markers for in vivo
imaging of protein include those detectable by X-radiography, NMR or ESR. For
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers for
NMR and ESR include those with a detectable characteristic spin, such as deuterium,
which may be incorporated into the antibody by labeling of nutrients for the relevant
hybridoma.

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
resonance, is introduced (for example, parenterally, subcutaneously, or
intraperitoneally) into the mammal. It will be understood in the art that the size of the
subject and the imaging system used will determine the quantity of imaging moiety
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human
subject, the quantity of radioactivity injected will normally range from about 5 to 20
millicuries of 99mTc. The labeled antibody or antibody fragment will then
preferentially accumulate at the location of cells which contain the specific protein. In
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The
Radiochemical Detection of Cancer, S.W. Burchiel and B. A. Rhodes, eds., Masson
Publishing Inc. (1982).)

Thus, the invention provides a diagnostic method of a disorder, which involves
(a) assaying the expression of a polypeptide of the present invention in cells or body
fluid of an individual; and (b) comparing the level of gene expression with a standard gene
expression level, whereby an increase or decrease in the assayed polypeptide gene
expression level compared to the standard expression level is indicative of a disorder.

Moreover, polypeptides of the present invention can be used to treat disease.
For example, patients can be administered a polypeptide of the present invention in an
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the
activity of a membrane bound receptor by competing with it for free ligand (e.g.,
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can also 
be used to treat disease. For example, administration of an antibody directed to a 
polypeptide of the present invention can bind and reduce overproduction of the 
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such 
as by binding to a polypeptide bound to a membrane (receptor). 

At the very least, the polypeptides of the present invention could be used as 
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration 
columns using methods well known to those of skill in the art. Polypeptides can also 
be used to raise antibodies, which in turn are used to measure protein expression from a 
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the 
polypeptides of the present invention can be used to test the following biological
activities.

Biological Activities

The polynucleotides and polypeptides of the present invention can be used in
assays to test for one or more biological activities. If these polynucleotides and
polypeptides do exhibit activity in a particular assay, it is likely that these molecules
may be involved in the diseases associated with the biological activity. Thus, the
polynucleotides and polypeptides could be used to treat the associated disease.

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in
treating deficiencies or disorders of the immune system, by activating or inhibiting the
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune
cells develop through a process called hematopoiesis, producing myeloid (platelets, red
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells
from pluripotent stem cells. The etiology of these immune deficiencies or disorders
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g.,
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide
of the present invention can be used as a marker or detector of a particular immune
system disease or disorder.

A polynucleotide or polypeptide of the present invention may be useful in
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or
polynucleotide of the present invention could be used to increase differentiation and
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to
treat those disorders associated with a decrease in certain (or many) types of hematopoietic
cells. Examples of immunologic deficiency syndromes include, but are not limited to:
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome,
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria.

Moreover, a polypeptide or polynucleotide of the present invention could also
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot
formation). For example, by increasing hemostatic or thrombolytic activity, a
polynucleotide or polypeptide of the present invention could be used to treat blood
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet
disorders (e.g., thrombocytopenia), or wounds resulting from trauma, surgery, or other
causes. Alternatively, a polynucleotide or polypeptide of the present invention that can
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks
(infarction), strokes, or scarring.

A polynucleotide or polypeptide of the present invention may also be useful in
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This
inappropriate recognition results in an immune response leading to the destruction of the
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the present
invention include, but are not limited to: Addison's Disease, hemolytic anemia,
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis,
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis,
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus,
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation,
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune
inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic 
asthma) or other respiratory problems, may also be treated by a polypeptide or 
polynucleotide of the present invention. Moreover, these molecules can be used to treat 
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an
immune response. Similarly, an immune response is also involved in GVHD, but, in
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also be
used to modulate inflammation. For example, the polypeptide or polynucleotide may
inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
shock, sepsis, or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or
IL-1).

Hyperproliferative Disorders

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative
disorders, including neoplasms. A polypeptide or polynucleotide of the present
invention may inhibit the proliferation of the disorder through direct or indirect
interactions. Alternatively, a polypeptide or polynucleotide of the present invention
may proliferate other cells which can inhibit the hyperproliferative disorder.

For example, by increasing an immune response, particularly increasing
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating,
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune
response may be increased by either enhancing an existing immune response, or by
initiating a new immune response. Alternatively, decreasing an immune response may
also be a method of treating hyperproliferative disorders, such as a chemotherapeutic
agent.

Examples of hyperproliferative disorders that can be treated or detected by a
polynucleotide or polypeptide of the present invention include, but are not limited to,
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas,
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus,
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system,
pelvic, skin, soft tissue, spleen, thoracic, and urogenital.

Similarly, other hyperproliferative disorders can also be treated or detected by a
polynucleotide or polypeptide of the present invention. Examples of such
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia,
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and
any other hyperproliferative disease, besides neoplasia, located in an organ system
listed above.

Infectious Disease 

A polypeptide or polynucleotide of the present invention can be used to treat or
detect infectious agents. For example, by increasing the immune response, particularly
increasing the proliferation and differentiation of B and/or T cells, infectious diseases
may be treated. The immune response may be increased by either enhancing an existing
immune response, or by initiating a new immune response. Alternatively, the
polypeptide or polynucleotide of the present invention may also directly inhibit the
infectious agent, without necessarily eliciting an immune response.

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these
symptoms or diseases.

Similarly, bacterial or fungal agents that can cause disease or symptoms and that
can be treated or detected by a polynucleotide or polypeptide of the present invention
include, but are not limited to, the following Gram-negative and Gram-positive bacterial
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium,
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae,
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter,
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella,
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis,
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter,
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus,
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis,
and Staphylococcal. These bacterial or fungal families can cause the following diseases
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS
related infections), paronychia, prosthesis-related infections, Reiter's Disease,
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning,
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria,
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus,
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases
(e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, and wound infections.
A polypeptide or polynucleotide of the present invention can be used to treat or detect
any of these symptoms or diseases.

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are not
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis,
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas.
These parasites can cause a variety of diseases or symptoms, including, but not limited
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery,
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related),
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide
of the present invention can be used to treat or detect any of these symptoms or
diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present
invention could either be by administering an effective amount of a polypeptide to the
patient, or by removing cells from the patient, supplying the cells with a polynucleotide
of the present invention, and returning the engineered cells to the patient (ex vivo
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be
used as an antigen in a vaccine to raise an immune response against infectious disease.



Regeneration 

A polynucleotide or polypeptide of the present invention can be used to
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis, reperfusion
injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs
without scarring or with decreased scarring. Regeneration also may include angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may increase
regeneration of tissues difficult to heal. For example, increased tendon/ligament
regeneration would quicken recovery time after damage. A polynucleotide or
polypeptide of the present invention could also be used prophylactically in an effort to
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel
syndrome, and other tendon or ligament defects. A further example of tissue
regeneration of non-healing wounds includes pressure ulcers, ulcers associated with
vascular insufficiency, and surgical and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of the
present invention.


Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotactic
activity. A chemotactic molecule attracts or mobilizes cells (e.g., monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells) to a particular site in the body, such as a site of inflammation, infection, or
hyperproliferation. The mobilized cells can then fight off and/or heal the particular
trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase
chemotactic activity of particular cells. These chemotactic molecules can then be used to
treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotactic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of the
present invention can also attract fibroblasts, which can be used to treat wounds.

It is also contemplated that a polynucleotide or polypeptide of the present
invention may inhibit chemotactic activity. These molecules could also be used to treat
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used
as an inhibitor of chemotaxis.

Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that
bind to the polypeptide or for molecules to which the polypeptide binds. The binding
of the polypeptide and the molecule may activate (agonist), increase, inhibit
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules.

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, or a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate
cells which express the polypeptide, either as a secreted protein or on the cell
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed
polypeptide) are then preferably contacted with a test compound potentially containing
the molecule to observe binding, stimulation, or inhibition of activity of either the
polypeptide or the molecule.

The assay may simply test binding of a candidate compound to the polypeptide,
wherein binding is detected by a label, or in an assay involving competition with a
labeled competitor. Further, the assay may test whether the candidate compound results
in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations,
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product
mixtures. The assay may also simply comprise the steps of mixing a candidate
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.

Preferably, an ELISA assay can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The
antibody can measure polypeptide level or activity by either binding, directly or
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of these above assays can be used as diagnostic or prognostic markers. The
molecules discovered using these assays can be used to treat disease or to bring about a
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or
enhance the production of the polypeptide from suitably manipulated cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining if
binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if
a biological activity of the polypeptide has been altered.
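
By way of non-limiting illustration, the step of comparing a measured activity to a
standard, and of determining whether that activity has been altered, might be sketched
in Python as follows. The names AssayReading and classify_compound, the 10% tolerance,
and the three output labels are assumptions introduced here for illustration only and
do not appear in the specification; an actual assay would rely on replicate
measurements and appropriate statistics rather than a single fixed cutoff.

```python
from dataclasses import dataclass


@dataclass
class AssayReading:
    """One measurement of polypeptide/molecule activity (arbitrary units)."""
    compound: str
    activity: float


def classify_compound(reading: AssayReading,
                      standard_activity: float,
                      tolerance: float = 0.10) -> str:
    """Compare a reading to a standard and report whether activity was altered.

    `tolerance` is a hypothetical fractional band around the standard within
    which the activity is treated as unchanged.
    """
    if standard_activity <= 0:
        raise ValueError("standard_activity must be positive")
    ratio = reading.activity / standard_activity
    if ratio > 1.0 + tolerance:
        return "candidate agonist"      # activity increased relative to standard
    if ratio < 1.0 - tolerance:
        return "candidate antagonist"   # activity decreased relative to standard
    return "no detectable change"
```

A reading is classified only relative to the standard supplied by the caller, so the
same function could in principle be used with a cell-based or a cell-free preparation.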


Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or
decrease the differentiation or proliferation of embryonic stem cells, in addition to
cells of the hematopoietic lineage, as discussed above.

A polypeptide or polynucleotide of the present invention may also be used to
modulate mammalian characteristics, such as body height, weight, hair color, eye color,
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be
used to modulate mammalian metabolism affecting catabolism, anabolism, processing,
utilization, and storage of energy.

A polypeptide or polynucleotide of the present invention may be used to change
a mammal's mental state or physical state by influencing biorhythms, circadian
rhythms, depression (including depressive disorders), tendency for violence, tolerance
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity),
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive
qualities.

A polypeptide or polynucleotide of the present invention may also be used as a
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components.

Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of
SEQ ID NO:X wherein X is any integer as defined in Table 1.
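
By way of non-limiting illustration, one simple way to screen a candidate sequence
against a criterion of the form "at least 95% identical to a sequence of at least
about 50 contiguous nucleotides" is an ungapped sliding-window comparison, sketched
below in Python. The function names, the ungapped comparison, and the default window
and threshold values are assumptions made for this sketch; it is not a substitute for
whatever definition of percent identity the specification adopts, which may, for
example, account for gaps.

```python
def best_window_identity(query: str, reference: str, window: int = 50) -> float:
    """Highest ungapped percent identity between any `window`-long stretch of
    `reference` and any equally long stretch of `query` (0.0 if either is
    shorter than `window`)."""
    query, reference = query.upper(), reference.upper()
    if len(query) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(query) - window + 1):
            # Count position-by-position matches between the two windows.
            matches = sum(a == b for a, b in zip(ref_win, query[j:j + window]))
            best = max(best, 100.0 * matches / window)
    return best


def meets_identity_threshold(query: str, reference: str,
                             window: int = 50,
                             min_identity: float = 95.0) -> bool:
    """True if some window of `query` is at least `min_identity` percent
    identical to some contiguous window of `reference`."""
    return best_window_identity(query, reference, window) >= min_identity
```

The brute-force double loop is quadratic in sequence length and is intended only to
make the criterion concrete; the same check applies unchanged to the 150- and
500-nucleotide windows by passing a different `window` value.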

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of
the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the
range of positions beginning with the nucleotide at about the position of the 5'
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID
NO:X in Table 1.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 150 contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 500 contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X.

A further preferred embodiment is a nucleic acid molecule comprising a
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in
Table 1.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide
sequence of SEQ ID NO:X.

Also preferred is an isolated nucleic acid molecule which hybridizes under
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid
molecule which hybridizes does not hybridize under stringent hybridization conditions
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or
of only T residues.

Also preferred is a composition of matter comprising a DNA molecule which
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1,
which DNA molecule is contained in the material deposited with the American Type
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said
cDNA Clone Identifier.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the
ATCC Deposit Number shown in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at 
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete 
open reading frame sequence encoded by said human cDNA clone. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to a sequence of at least 500
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide
sequence encoded by said human cDNA clone.

A further preferred embodiment is a method for detecting in a biological sample
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing a nucleotide sequence of at least one nucleic acid
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said nucleic acid molecule in said sample is at least 95%
identical to said selected sequence.

Also preferred is the above method wherein said step of comparing sequences
comprises determining the extent of nucleic acid hybridization between nucleic acid
molecules in said sample and a nucleic acid molecule comprising said sequence selected
from said group. Similarly, also preferred is the above method wherein said step of
comparing sequences is performed by comparing the nucleotide sequence determined
from a nucleic acid molecule in said sample with said sequence selected from said
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules.

A further preferred embodiment is a method for identifying the species, tissue or
cell type of a biological sample which method comprises a step of detecting nucleic acid
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

The method for identifying the species, tissue or cell type of a biological sample
can comprise a step of detecting nucleic acid molecules comprising a nucleotide
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides
in a sequence selected from said group.
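
By way of non-limiting illustration, the detection of a panel of at least two
nucleotide sequences in a sample might be tabulated as sketched below in Python. The
function names, the dictionary-based panel, and the ungapped window comparison are
hypothetical choices made for this sketch and are not part of the specification; in
practice detection could equally be performed by hybridization rather than by
sequence comparison.

```python
def window_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length strings."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def detects(sample_seq: str, panel_seq: str,
            window: int = 50, min_identity: float = 95.0) -> bool:
    """True if some `window`-long stretch of `sample_seq` is at least
    `min_identity` percent identical to a contiguous stretch of `panel_seq`."""
    s, p = sample_seq.upper(), panel_seq.upper()
    for i in range(len(p) - window + 1):
        for j in range(len(s) - window + 1):
            if window_identity(p[i:i + window], s[j:j + window]) >= min_identity:
                return True
    return False  # also reached when either sequence is shorter than `window`


def panel_hits(sample_seqs, panel, window: int = 50, min_identity: float = 95.0):
    """Map each panel member name to whether any sample sequence detects it."""
    return {
        name: any(detects(s, seq, window, min_identity) for s in sample_seqs)
        for name, seq in panel.items()
    }
```

The resulting pattern of detected and undetected panel members is the raw readout;
how that pattern is related to a particular species, tissue or cell type is outside
the scope of this sketch.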

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide
sequence that is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said
cDNA clone in Table 1.

The method for diagnosing a pathological condition can comprise a step of
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least
two nucleotide sequences, wherein at least one sequence in said panel is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
said group.

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
nucleic acid molecules can comprise DNA molecules or RNA molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 90% identical to a sequence of at least about 10 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions
beginning with the residue at about the position of the First Amino Acid of the Secreted
Portion and ending with the residue at about the Last Amino Acid of the Open Reading
Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to the amino acid sequence of the secreted portion of the protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Further preferred is an isolated antibody which binds specifically to a
polypeptide comprising an amino acid sequence that is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method for detecting in a biological sample a polypeptide
comprising an amino acid sequence which is at least 90% identical to a sequence of at
least 10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing an amino acid sequence of at least one polypeptide
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said polypeptide molecule in said sample is at least 90%
identical to said sequence of at least 10 contiguous amino acids.

Also preferred is the above method wherein said step of comparing an amino
acid sequence of at least one polypeptide molecule in said sample with a sequence
selected from said group comprises determining the extent of specific binding of
polypeptides in said sample to an antibody which binds specifically to a polypeptide
comprising an amino acid sequence that is at least 90% identical to a sequence of at least
10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences is
performed by comparing the amino acid sequence determined from a polypeptide
molecule in said sample with said sequence selected from said group.

Also preferred is a method for identifying the species, tissue or cell type of a
biological sample which method comprises a step of detecting polypeptide molecules in
said sample, if any, comprising an amino acid sequence that is at least 90% identical to
a sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method for identifying the species, tissue or cell type
of a biological sample, which method comprises a step of detecting polypeptide
molecules comprising an amino acid sequence in a panel of at least two amino acid
sequences, wherein at least one sequence in said panel is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the above
group.

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject polypeptide molecules comprising an amino acid sequence in
a panel of at least two amino acid sequences, wherein at least one sequence in said panel
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y wherein Y is any integer as defined in Table 1; and a complete amino acid
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1.

In any of these methods, the step of detecting said polypeptide molecules 
includes using an antibody. 

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a nucleotide sequence encoding a
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is
any integer as defined in Table 1; and a complete amino acid sequence of a secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide 
sequence encoding a polypeptide has been optimized for expression of said polypeptide 
in a prokaryotic host. 

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide
comprises an amino acid sequence selected from the group consisting of: an amino acid
sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method of making a recombinant vector comprising
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is
the recombinant vector produced by this method. Also preferred is a method of making
a recombinant host cell comprising introducing the vector into a host cell, as well as the
recombinant host cell produced by this method.

Also preferred is a method of making an isolated polypeptide comprising
culturing this recombinant host cell under conditions such that said polypeptide is
expressed and recovering said polypeptide. Also preferred is this method of making an
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said
polypeptide is a secreted portion of a human secreted protein comprising an amino acid
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y beginning with the residue at the position of the First Amino Acid of the Secreted
Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and said position
of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is defined in Table 1;
and an amino acid sequence of a secreted portion of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
isolated polypeptide produced by this method is also preferred.

Also preferred is a method of treatment of an individual in need of an increased
level of a secreted protein activity, which method comprises administering to such an
individual a pharmaceutical composition comprising an amount of an isolated
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase
the level of said protein activity in said individual.

Having generally described the invention, the same will be more readily
understood by reference to the following examples, which are provided by way of
illustration and are not intended as limiting.



Examples 

Example 1: Isolation of a Selected cDNA Clone From the Deposited
Sample

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector.
Table 1 identifies the vectors used to construct the cDNA library from which each clone
was isolated. In many cases, the vector used to construct the library is a phage vector
from which a plasmid has been excised. The table immediately below correlates the
related plasmid for each phage vector used in constructing the cDNA library. For
example, where a particular clone is identified in Table 1 as being isolated in the vector
"Lambda Zap," the corresponding deposited clone is in "pBluescript."

Vector Used to Construct Library     Corresponding Deposited Plasmid

Lambda Zap                           pBluescript (pBS)
Uni-Zap XR                           pBluescript (pBS)
Zap Express                          pBK
lafmid BA                            plafmid BA
pSport1                              pSport1
pCMVSport 2.0                        pCMVSport 2.0
pCMVSport 3.0                        pCMVSport 3.0
pCR®2.1                              pCR®2.1

Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1
Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-.
The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which
are the first sites on each respective end of the linker). "+" or "-" refers to the orientation
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue
initiated from the f1 ori generates sense strand DNA and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).) Preferably, a polynucleotide of the present invention does not comprise the
phage vector sequences identified for the particular clone in Table 1, as well as the
corresponding plasmid vector sequences designated above.

The deposited material in the sample assigned the ATCC Deposit Number cited
in Table 1 for any given cDNA clone also may contain one or more additional plasmids,
each comprising a cDNA clone different from that given clone. Thus, deposits sharing
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each
containing a different cDNA clone; but such a deposit sample may include plasmids for
more or less than 50 cDNA clones, up to about 500 cDNA clones.

Two approaches can be used to isolate a particular clone from the deposited
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID
NO:X.

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized
using an Applied Biosystems DNA synthesizer according to the sequence reported.
The oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY (1982).)
The plasmid mixture is transformed into a suitable host, as indicated above (such as
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as
those provided by the vector supplier or in related publications or patents cited above.
The transformants are plated on 1.5% agar plates (containing the appropriate selection
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate.
These plates are screened using Nylon membranes according to routine methods for
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to
1.104), or other techniques known to those of skill in the art.
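
As a rough planning aid for the colony screen described above, the number of transformants to plate can be estimated from the stated composition of a deposit (about 50, and up to about 500, plasmids per sample) and the stated density of about 150 colonies per plate. The following Python sketch is illustrative only. It assumes that every plasmid in the mixture is equally likely to give rise to any one colony, which the text does not state (it says only "approximately equal amounts (by weight)"), and the function names and the 99% target are hypothetical choices.

```python
import math


def colonies_needed(n_clones: int, confidence: float = 0.99) -> int:
    """Colonies to screen so that a given clone appears at least once.

    Assumption (not from the source): each of the n_clones plasmids is
    equally likely to be the one carried by any single colony.
    """
    if n_clones < 1 or not 0.0 < confidence < 1.0:
        raise ValueError("need n_clones >= 1 and 0 < confidence < 1")
    if n_clones == 1:
        return 1
    p_miss = 1.0 - 1.0 / n_clones  # chance one colony is NOT the wanted clone
    # Smallest integer N with p_miss ** N <= 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(p_miss))


def plates_needed(colonies: int, per_plate: int = 150) -> int:
    """Plates required at about 150 colonies per plate (density from the text)."""
    return math.ceil(colonies / per_plate)


if __name__ == "__main__":
    for n in (50, 500):
        c = colonies_needed(n)
        print(f"{n} clones: screen about {c} colonies on {plates_needed(c)} plate(s)")
```

Under these assumptions a typical 50-plasmid deposit needs only a couple of plates, whereas a 500-plasmid deposit needs roughly ten times as many colonies.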

Alternatively, two primers of 17-20 nucleotides derived from both ends of the
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of
each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular
weight is excised and purified. The PCR product is verified to be the selected sequence
by subcloning and sequencing the DNA product.
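
The derivation of the two end primers described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: the function names are hypothetical, the "5' NT" and "3' NT" positions are treated as 1-based inclusive coordinates (an assumption about how Table 1 is indexed), and no melting-temperature or secondary-structure checks are made.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(seq: str, five_nt: int, three_nt: int, length: int = 20):
    """Pick a forward and a reverse primer from the ends of a clone region.

    five_nt and three_nt are taken here as 1-based, inclusive positions
    bounding the clone within seq (an assumed convention).
    """
    if not 17 <= length <= 20:
        raise ValueError("the text calls for primers of 17-20 nucleotides")
    region = seq[five_nt - 1:three_nt]
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]                       # matches the sense strand
    reverse = reverse_complement(region[-length:])  # anneals to the sense strand
    return forward, reverse
```

For example, `end_primers(sequence, five_nt=3, three_nt=52)` returns the first 20 nucleotides of the bounded region as the forward primer and the reverse complement of its last 20 nucleotides as the reverse primer.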

Several methods are available for the identification of the 5' or 3' non-coding
portions of a gene which may not be present in the deposited clone. These methods
include but are not limited to, filter probing, clone enrichment using specific probes,
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known
in the art. For instance, a method similar to 5' RACE is available for generating the
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids
Res. 21(7):1683-1684 (1993).)

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population
of RNA presumably containing full-length gene RNA transcripts. A primer set
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the
desired full-length gene. This amplified product may then be sequenced and used to
generate the full length gene.

The above method starts with total RNA isolated from the desired source,
although poly-A+ RNA can be used. The RNA preparation can then be treated with
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged
RNA which may interfere with the later RNA ligase step. The phosphatase should then
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to
remove the cap structure present at the 5' ends of messenger RNAs. This reaction
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be
ligated to an RNA oligonucleotide using T4 RNA ligase.

This modified RNA preparation is used as a template for first strand cDNA
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is
used as a template for PCR amplification of the desired 5' end using a primer specific to
the ligated RNA oligonucleotide and a primer specific to the known sequence of the
gene of interest. The resultant product is then sequenced and analyzed to confirm that
the 5' end sequence belongs to the desired gene.


Example 2: Isolation of Genomic Clones Corresponding to a 
Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X,
according to the method described in Example 1. (See also, Sambrook.)

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present
invention is determined using protocols for Northern blot analysis, described by,
among others, Sambrook et al. For example, a cDNA probe produced by the method
described in Example 1 is labeled with P32 using the rediprime™ DNA labeling system
(Amersham Life Science), according to manufacturer's instructions. After labeling, the
probe is purified using CHROMA SPIN-100™ column (Clontech Laboratories, Inc.),
according to manufacturer's protocol number PT1200-1. The purified labeled probe is
then used to examine various human tissues for mRNA expression.

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or
human immune system tissues (IM) (Clontech) are examined with the labeled probe
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's
protocol number PT1190-1. Following hybridization and washing, the blots are
mounted and exposed to film at -70°C overnight, and the films developed according to
standard procedures.

Example 4: Chromosomal Mapping of the Polynucleotides

An oligonucleotide primer set is designed according to the sequence at the 5'
end of SEQ ID NO:X. This primer preferably spans about 100 nucleotides. This
primer set is then used in a polymerase chain reaction under the following set of
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA
is used as template in addition to a somatic cell hybrid panel containing individual
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is
determined by the presence of an approximately 100 bp PCR fragment in the particular
somatic cell hybrid.
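
The assignment step, scoring which hybrids yield the approximately 100 bp fragment and matching that pattern to the human chromosomes each hybrid retains, can be sketched as a simple concordance check. The Python below is a hypothetical illustration: the panel layout, the hybrid names, and the requirement of perfect concordance are assumptions made for the sketch, not details given in the text.

```python
def concordant_chromosomes(panel, band_present):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel:        maps a hybrid name to the set of human chromosomes it retains.
    band_present: maps the same hybrid names to True if the ~100 bp band is seen.
    Both structures are illustrative; a real panel may carry fragments only.
    """
    all_chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(all_chromosomes):
        # A candidate must be present wherever the band is seen and
        # absent wherever it is not.
        if all((chrom in panel[h]) == band_present[h] for h in panel):
            hits.append(chrom)
    return hits


if __name__ == "__main__":
    example_panel = {"H1": {"1", "7"}, "H2": {"7", "X"}, "H3": {"1", "X"}}
    example_result = {"H1": True, "H2": True, "H3": False}
    print(concordant_chromosomes(example_panel, example_result))
```

In practice some discordance is tolerated and scored statistically; the strict match here simply keeps the logic visible.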

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers
used to amplify the cDNA insert should preferably contain restriction sites, such as
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product
into the expression vector. For example, BamHI and XbaI correspond to the restriction
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth,
CA). This plasmid vector encodes antibiotic resistance (Ampr), a bacterial origin of
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites.

The pQE-9 vector is digested with BamHI and XbaI and the amplified fragment
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses
the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are
identified by their ability to grow on LB plates and ampicillin/kanamycin resistant
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis.

Clones containing the desired constructs are grown overnight (O/N) in liquid
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml).
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG
(isopropyl-β-D-thiogalactopyranoside) is then added to a final concentration of 1 mM.
IPTG induces expression by inactivating the lacI repressor, which clears the P/O and
leads to increased gene expression.
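
The inoculation ratio and the IPTG addition above are ordinary dilution arithmetic (C1 x V1 = C2 x V2). The following Python sketch works them through; the 1 M IPTG stock and the 1 liter culture used in the example are assumptions for illustration, and the small volume contributed by the added stock is ignored.

```python
def inoculum_volume_ml(culture_volume_ml: float, dilution: float) -> float:
    """Overnight culture needed for a 1:dilution inoculation (e.g. 100 or 250)."""
    if culture_volume_ml <= 0 or dilution <= 0:
        raise ValueError("volumes and dilution must be positive")
    return culture_volume_ml / dilution


def stock_volume_ml(final_mM: float, culture_volume_ml: float, stock_mM: float) -> float:
    """Stock volume giving final_mM in the culture, from C1*V1 = C2*V2.

    The volume added is neglected, which is reasonable when the stock is
    much more concentrated than the target.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_volume_ml / stock_mM


if __name__ == "__main__":
    volume = 1000.0  # ml of large culture (assumed)
    print("inoculum, 1:100:", inoculum_volume_ml(volume, 100), "ml")
    print("inoculum, 1:250:", inoculum_volume_ml(volume, 250), "ml")
    print("1 M IPTG for 1 mM final:", stock_volume_ml(1.0, volume, 1000.0), "ml")
```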

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by
centrifugation (20 mins at 6000 x g). The cell pellet is solubilized in the chaotropic
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is
removed by centrifugation, and the supernatant containing the polypeptide is loaded
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high
affinity and can be purified in a simple one-step procedure (for details see: The
QIAexpressionist (1995) QIAGEN, Inc., supra).

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8;
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with
6 M guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-buffered
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the
protein can be successfully refolded while immobilized on the Ni-NTA column. The
recommended conditions are as follows: renature using a linear 6M-1M urea gradient in
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
The renaturation should be performed over a period of 1.5 hours or more. After
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C.

In addition to the above expression vector, the present invention further includes
an expression vector comprising phage operator and promoter elements operatively
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession
Number XXXXXX.) This vector contains: 1) a neomycin phosphotransferase gene as
a selection marker, 2) an E. coli origin of replication, 3) a T5 phage promoter sequence,
4) two lac operator sequences, 5) a Shine-Dalgarno sequence, and 6) the lactose operon
repressor gene (lacIq). The origin of replication (oriC) is derived from pUC19 (LTI,
Gaithersburg, MD). The promoter sequence and operator sequences are made
synthetically.

DNA can be inserted into the pHE4a by restricting the vector with NdeI and
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA
insert is generated according to the PCR protocol described in Example 1, using PCR
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible
enzymes. The insert and vector are ligated according to standard protocols.
The engineered vector could easily be substituted in the above protocol to
express protein in a bacterial system.

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide expressed
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified,
all of the following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit
weight of cell paste and the amount of purified protein required, an appropriate amount
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a
high shear mixer.
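
The "appropriate amount of cell paste" follows directly from the two quantities named above: the protein required and the expected yield per unit weight of paste. A minimal Python sketch of that calculation is given below. The optional recovery factor (losses during refolding and chromatography) is an added assumption, and the numbers in the example are invented for illustration.

```python
def cell_paste_required_g(protein_needed_mg: float,
                          yield_mg_per_g: float,
                          recovery: float = 1.0) -> float:
    """Grams of cell paste to process for a target amount of purified protein.

    yield_mg_per_g: expected protein per gram of cell paste.
    recovery:       assumed fraction surviving purification (1.0 = no loss);
                    this parameter is hypothetical, not taken from the text.
    """
    if protein_needed_mg <= 0 or yield_mg_per_g <= 0:
        raise ValueError("amounts must be positive")
    if not 0.0 < recovery <= 1.0:
        raise ValueError("recovery must be in (0, 1]")
    return protein_needed_mg / (yield_mg_per_g * recovery)


if __name__ == "__main__":
    # Illustrative values only: 100 mg wanted, 2 mg per g paste, 50% recovery.
    print(cell_paste_required_g(100.0, 2.0, recovery=0.5), "g of cell paste")
```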

The cells are then lysed by passing the solution through a microfluidizer
(Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by
centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the
pellet is discarded and the polypeptide containing supernatant is incubated at 4°C
overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles,
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing
for 12 hours prior to further purification steps.

To clarify the refolded polypeptide solution, a previously prepared tangential
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored.
Fractions are collected and further analyzed by SDS-PAGE.

Fractions containing the polypeptide are then pooled and mixed with 4 volumes
of water. The diluted sample is then loaded onto a previously prepared set of tandem
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280
monitoring of the effluent. Fractions containing the polypeptide (determined, for
instance, by 16% SDS-PAGE) are then pooled.
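
For planning fraction collection, the NaCl concentration at any point in the 10 column volume linear gradient described above can be interpolated between its endpoints (0.2 M and 1.0 M). The Python sketch below shows the interpolation; the function name is hypothetical, and the calculation assumes an ideal linear gradient with no mixing or delay volume.

```python
def gradient_nacl_molar(column_volumes: float,
                        start_molar: float = 0.2,
                        end_molar: float = 1.0,
                        gradient_length_cv: float = 10.0) -> float:
    """NaCl concentration after a given number of column volumes of gradient.

    Defaults follow the text (0.2 M to 1.0 M over 10 column volumes);
    an ideal, delay-free linear gradient is assumed.
    """
    if not 0.0 <= column_volumes <= gradient_length_cv:
        raise ValueError("column_volumes must lie within the gradient")
    fraction = column_volumes / gradient_length_cv
    return start_molar + (end_molar - start_molar) * fraction


if __name__ == "__main__":
    for cv in (0, 2.5, 5, 7.5, 10):
        print(f"{cv:>4} CV: {gradient_nacl_molar(cv):.2f} M NaCl")
```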

The resultant polypeptide should exhibit greater than 95% purity after the above
refolding and purification steps. No major contaminant bands should be observed from
Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
The purified protein can also be tested for endotoxin/LPS contamination, and typically
the LPS content is less than 0.1 ng/ml according to LAL assays.



Example 7: Cloning and Expression of a Polypeptide in a Baculovirus 
Expression System 

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide
into a baculovirus to express a polypeptide. This expression vector contains the strong
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus
(AcMNPV) followed by convenient restriction sites such as BamHI, Xba I and
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient
polyadenylation. For easy selection of recombinant virus, the plasmid contains the
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The
inserted genes are flanked on both sides by viral sequences for cell-mediated
homologous recombination with wild-type viral DNA to generate a viable virus that
expresses the cloned polynucleotide.

Many other baculovirus vectors can be used in place of the vector above, such
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, as
long as the construct provides appropriately located signals for transcription,
translation, secretion and the like, including a signal peptide and an in-frame AUG as
required. Such vectors are described, for instance, in Luckow et al., Virology 170:31-39
(1989).

Specifically, the cDNA sequence contained in the deposited clone, including the
AUG initiation codon and the naturally associated leader sequence identified in Table 1,
is amplified using the PCR protocol described in Example 1. If the naturally occurring
signal sequence is used to produce the secreted protein, the pA2 vector does not need a
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a
baculovirus leader sequence, using the standard methods described in Summers et al.,
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures,"
Texas Agricultural Experimental Station Bulletin No. 1555 (1987).

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The plasmid is digested with the corresponding restriction enzymes and
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.).

The fragment and the dephosphorylated plasmid are ligated together with T4
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation
mixture and spread on culture plates. Bacteria containing the plasmid are identified by
digesting DNA from individual colonies and analyzing the digestion product by gel
electrophoresis. The sequence of the cloned fragment is confirmed by DNA
sequencing.

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are
added, mixed and incubated for 15 minutes at room temperature. Then the transfection
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm
tissue culture plate with 1 ml Grace's medium without serum. The plate is then
incubated for 5 hours at 27°C. The transfection solution is then removed from the plate
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added.
Cultivation is then continued at 27°C for four days.

After four days the supernatant is collected and a plaque assay is performed, as
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a
"plaque assay" of this type can also be found in the user's guide for insect cell culture
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.)
After appropriate incubation, blue stained plaques are picked with the tip of a
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in
35 mm dishes. Four days later the supernatants of these culture dishes are harvested
and then they are stored at 4°C.

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the
recombinant baculovirus containing the polynucleotide at a multiplicity of infection
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is
removed and is replaced with SF900 II medium minus methionine and cysteine
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi of 35S-methionine
and 5 µCi 35S-cysteine (available from Amersham) are added. The cells are
further incubated for 16 hours and then are harvested by centrifugation. The proteins in
the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE
followed by autoradiography (if radiolabeled).
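
Infecting at an MOI of about 2 requires knowing how much viral stock to add, which depends on the cell number and the stock titer. The Python sketch below shows the usual relationship (volume = MOI x cells / titer). The titer and cell count in the example are assumed values, since the text gives neither, and the function name is illustrative.

```python
def virus_inoculum_ml(cell_count: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of viral stock needed to infect cell_count cells at a given MOI.

    titer_pfu_per_ml would come from a plaque assay of the stock;
    it is not specified in the text and must be measured.
    """
    if cell_count <= 0 or moi <= 0 or titer_pfu_per_ml <= 0:
        raise ValueError("all inputs must be positive")
    return moi * cell_count / titer_pfu_per_ml


if __name__ == "__main__":
    # Assumed example: 1e6 Sf9 cells, MOI 2, stock at 1e8 pfu/ml.
    print(virus_inoculum_ml(1e6, 2.0, 1e8), "ml of viral stock")
```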

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced protein. 

Example 8: Expression of a Polypeptide in Mammalian Cells

The polypeptide of the present invention can be expressed in a mammalian cell.
A typical mammalian expression vector contains a promoter element, which mediates
the initiation of transcription of mRNA, a protein coding sequence, and signals required
for the termination of transcription and polyadenylation of the transcript. Additional
elements include enhancers, Kozak sequences and intervening sequences flanked by
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved
with the early and late promoters from SV40, the long terminal repeats (LTRs) from
Retroviruses, e.g., RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus
(CMV). However, cellular elements can also be used (e.g., the human actin promoter).

Suitable expression vectors for use in practicing the present invention include,
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden),
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109),
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used
include human HeLa, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1,
Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO)
cells.

Alternatively, the polypeptide can be expressed in stable cell lines containing the
polynucleotide integrated into a chromosome. Co-transfection with a selectable
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation
of the transfected cells.

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing
cell lines that carry several hundred or even several thousand copies of the gene of
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin,
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991);
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the
mammalian cells are grown in selective medium and the cells with the highest resistance
are selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors also contain the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse
DHFR gene under control of the SV40 early promoter.

Specifically, the plasmid pC6, for example, is digested with appropriate
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by
procedures known in the art. The vector is then isolated from a 1% agarose gel.

A polynucleotide of the present invention is amplified according to the protocol
outlined in Example 1. If the naturally occurring signal sequence is used to produce the
secreted protein, the vector does not need a second signal peptide. Alternatively, if the
naturally occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

The amplified fragment is isolated from a 1% agarose gel using a commercially
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested
with appropriate restriction enzymes and again purified on a 1% agarose gel.

The amplified fragment is then digested with the same restriction enzyme and
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then
transformed and bacteria are identified that contain the fragment inserted into plasmid
pC6 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of the
plasmid pSV2-neo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that
confers resistance to a group of antibiotics including G418. The cells are seeded in
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM,
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of
methotrexate are then transferred to new 6-well plates containing even higher
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same
procedure is repeated until clones are obtained which grow at a concentration of 100 -
200 µM. Expression of the desired gene product is analyzed, for instance, by SDS-
PAGE and Western blot or by reversed phase HPLC analysis.
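
The stepwise methotrexate selection can be written out as a simple concentration ladder. The sketch below is illustrative only: it assumes every tier above 800 nM is micromolar (consistent with the stated endpoint of 100 - 200 µM), it omits the initial 10 - 50 ng/ml cloning step because that step is given by mass, and the names `LADDER_NM`, `next_tier` and `fold_below_target` are hypothetical.

```python
# Methotrexate selection tiers, expressed in nM so they can be compared directly.
# Assumption: every tier above 800 nM is micromolar (1 uM = 1,000 nM).
# The initial 10-50 ng/ml cloning step is left out because it is given by mass.
SECOND_TIER_NM = [50, 100, 200, 400, 800]
THIRD_TIER_NM = [1_000, 2_000, 5_000, 10_000, 20_000]
TARGET_RANGE_NM = (100_000, 200_000)  # 100 - 200 uM endpoint

LADDER_NM = sorted(set(SECOND_TIER_NM + THIRD_TIER_NM))


def next_tier(current_nm):
    """Return the next higher concentration on the ladder, or None at the top."""
    higher = [c for c in LADDER_NM if c > current_nm]
    return higher[0] if higher else None


def fold_below_target(current_nm):
    """How many fold the current concentration sits below the lower target bound."""
    return TARGET_RANGE_NM[0] / current_nm
```

Under these assumptions, a clone growing at the top of the listed ladder (20 µM) is still five-fold below the lower end of the target range, which is why the procedure is repeated.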

Example 9: Protein Fusions 

The polypeptides of the present invention are preferably fused to other proteins.
These fusion proteins can be used for a variety of applications. For example, fusion of
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose
binding protein facilitates purification. (See Example 5; see also EP A 394,827;
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and
albumin increases the half-life in vivo. Nuclear localization signals fused to the
polypeptides of the present invention can target the protein to a specific subcellular
localization, while covalent heterodimers or homodimers can increase or decrease the
activity of a fusion protein. Fusion proteins can also create chimeric molecules having
more than one function. Finally, fusion proteins can increase solubility and/or stability
of the fused protein compared to the non-fused protein. All of the types of fusion
proteins described above can be made by modifying the following protocol, which
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in
Example 5.

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using 
primers that span the 5' and 3' ends of the sequence described below. These primers 
also should have convenient restriction enzyme sites that will facilitate cloning into an 
expression vector, preferably a mammalian expression vector. 

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be
destroyed. Next, the vector containing the human Fc portion is re-restricted with
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that
the polynucleotide is cloned without a stop codon, otherwise a fusion protein will not
be produced.
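
The requirement that the insert carry no stop codon can be checked before cloning. This is a minimal sketch, assuming the insert is supplied 5' to 3' as its coding strand beginning at the start codon; the function name `fusion_ready` and the short example sequences in the usage note are hypothetical and are not taken from the protocol.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_ready(orf: str) -> bool:
    """Return True if the ORF can be read through into a downstream Fc partner.

    Two conditions are checked: the length is a whole number of codons (so the
    downstream reading frame is preserved) and no in-frame stop codon is present.
    """
    seq = orf.upper().replace(" ", "")
    if len(seq) % 3 != 0:
        return False  # a partial codon would shift the frame of the Fc portion
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)
```

For example, `fusion_ready("ATGGCCAAA")` is true, while `fusion_ready("ATGGCCTAA")` is false because of the terminal TAA. Bases contributed by the cloning site itself are not modeled here.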

If the naturally occurring signal sequence is used to produce the secreted
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally
occurring signal sequence is not used, the vector can be modified to include a
heterologous signal sequence. (See, e.g., WO 96/34891.)

Human IgG Fc region:

GCKiATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCC
CAGCACCTGAATTCGAGCKTTGCACCGTC
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO: 1)

Example 10: Production of an Antibody from a Polypeptide

The antibodies of the present invention can be prepared by a variety of methods.
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of
the present invention are administered to an animal to induce the production of sera
containing polyclonal antibodies. In a preferred method, a preparation of the secreted
protein is prepared and purified to render it substantially free of natural contaminants.
Such a preparation is then introduced into an animal in order to produce polyclonal
antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56°C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 µg/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma
cell line. Any suitable myeloma cell line may be employed in accordance with the
present invention; however, it is preferable to employ the parent myeloma cell line
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981).) The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using
genetic constructs derived from hybridoma cells producing the monoclonal antibodies
described above. Methods for producing chimeric antibodies are known in the art.
(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)

Example 11: Production Of Secreted Protein For High-Throughput
Screening Assays

The following protocol produces a supernatant containing a polypeptide to be
tested. This supernatant can then be used in the Screening Assays described in
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium 17-516F Biowhittaker) for a
working solution of 50ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells. Plates may be
poly-lysine coated in advance for up to two weeks.
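
The coating step is a straightforward dilution, and the amounts can be checked as follows. This is a sketch under stated assumptions: "1:20" is read as one part stock in twenty parts total, the figure of 96 wells in the usage note corresponds to four 24-well plates, and the function names are hypothetical.

```python
def working_concentration(stock_ug_per_ml: float, dilution_factor: float) -> float:
    """Concentration after diluting one part stock into `dilution_factor` parts total."""
    return stock_ug_per_ml / dilution_factor


def stock_volume_ul(wells: int, ul_per_well: float, dilution_factor: float) -> float:
    """Volume of stock needed to prepare just enough working solution for all wells."""
    total_working_ul = wells * ul_per_well
    return total_working_ul / dilution_factor
```

With a 1 mg/ml (1,000 ug/ml) stock, `working_concentration(1000, 20)` gives the 50 ug/ml stated in the protocol, and `stock_volume_ul(96, 200, 20)` shows that coating 96 wells at 200 ul each needs 960 ul of stock, before any allowance for pipetting loss.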

Plate 293T cells (do not carry cells past P+20) at 2 x 10^5 cells/well in .5ml
DMEM (Dulbecco's Modified Eagle Medium)(with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with .5-1ml PBS. Person A then aspirates off
the PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37°C for 6 hours.
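
The volumes quoted for one 96-well plate of transfection complexes are internally consistent, as the arithmetic below shows. This is only a bookkeeping sketch: the constant names are hypothetical, and the small volume contributed by the DNA itself is ignored.

```python
# Volumes taken from the protocol, all in microliters, for one 96-well plate.
LIPOFECTAMINE_UL = 300
OPTIMEM_FOR_MIX_UL = 5000
MIX_PER_WELL_UL = 50
EXTRA_OPTIMEM_PER_WELL_UL = 150
WELLS = 96

mix_total_ul = LIPOFECTAMINE_UL + OPTIMEM_FOR_MIX_UL      # mixture prepared
mix_needed_ul = MIX_PER_WELL_UL * WELLS                   # mixture dispensed
spare_ul = mix_total_ul - mix_needed_ul                   # dead volume left in the basin

# Final complex volume per well (DNA volume ignored), which should match the
# 200 ul transferred onto each well of cells.
complex_per_well_ul = MIX_PER_WELL_UL + EXTRA_OPTIMEM_PER_WELL_UL

# Lipofectamine actually delivered to each well.
lipofectamine_per_well_ul = MIX_PER_WELL_UL * LIPOFECTAMINE_UL / mix_total_ul
```

The mixture leaves about 500 ul of dead volume, each well ends up with the 200 ul that is later transferred to the cells, and each well receives a little under 3 ul of Lipofectamine.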

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (see below) with 2mM glutamine and 1x penstrep.
(BSA (81-068-3 Bayer) 100gm dissolved in 1L DMEM for a 10% BSA stock
solution). Filter the media and collect 50 ul for endotoxin assay in 15ml polystyrene
conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of
the incubation period. Person A aspirates off the transfection media, while person B
adds 1.5ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300ul multichannel pipetter, aliquot 600ul in one 1ml deep
well plate and the remaining supernatant into a 2ml deep well. The supernatants from
each well can then be used in the assays described in Examples 13-20.

It is specifically understood that when activity is obtained in any of the assays
described below using a supernatant, the activity originates from either the polypeptide
directly (e.g., as a secreted protein) or by the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

HGS-CHO-5 medium formulation:

Inorganic Salts
  CaCl2 (anhyd)                                          116.6 mg/L
  CuSO4-5H2O                                             0.00130
  Fe(NO3)3-9H2O                                          0.050
  FeSO4-7H2O                                             0.417
  KCl                                                    311.80
  MgCl2                                                  28.64
  MgSO4                                                  48.84
  NaCl                                                   6995.50
  NaHCO3                                                 2400.0
  NaH2PO4-H2O                                            62.50
  Na2HPO4                                                71.02
  ZnSO4-7H2O                                             .4320

Lipids
  Arachidonic Acid                                       .002 mg/L
  Cholesterol                                            1.022
  DL-alpha-Tocopherol-Acetate                            .070
  Linoleic Acid                                          0.0520
  Linolenic Acid                                         0.010
  Myristic Acid                                          0.010
  Oleic Acid                                             0.010
  Palmitric Acid                                         0.010
  Palmitic Acid                                          0.010
  Pluronic F-68                                          100
  Stearic Acid                                           0.010
  Tween 80                                               2.20

Carbon Source
  D-Glucose                                              4551 mg/L

Amino Acids
  L-Alanine                                              130.85 mg/ml
  L-Arginine-HCL                                         147.50
  L-Asparagine-H2O                                       7.50
  L-Aspartic Acid                                        6.65
  L-Cystine-2HCL-H2O                                     29.56
  L-Cystine-2HCL                                         31.29
  L-Glutamic Acid                                        7.35
  L-Glutamine                                            365.0
  Glycine                                                18.75
  L-Histidine-HCL-H2O                                    52.48
  L-Isoleucine                                           106.97
  L-Leucine                                              111.45
  L-Lysine HCL                                           163.75
  L-Methionine                                           32.34
  L-Phenylalanine                                        68.48
  L-Proline                                              40.0
  L-Serine                                               26.25
  L-Threonine                                            101.05
  L-Tryptophan                                           19.22
  L-Tyrosine-2Na-2H2O                                    91.79
  L-Valine                                               99.65

Vitamins
  Biotin                                                 0.0035 mg/L
  D-Ca Pantothenate                                      3.24
  Choline Chloride                                       11.78
  Folic Acid                                             4.65
  i-Inositol                                             15.60
  Niacinamide                                            3.02
  Pyridoxal HCL                                          3.00
  Pyridoxine HCL                                         0.031
  Riboflavin                                             0.319
  Thiamine HCL                                           3.17
  Thymidine                                              0.365
  Vitamin B12                                            0.680

Other Components
  HEPES Buffer                                           25 mM
  Na Hypoxanthine                                        2.39 mg/L
  Lipoic Acid                                            0.105
  Sodium Putrescine-2HCL                                 0.081
  Sodium Pyruvate                                        55.0
  Sodium Selenite                                        0.0067
  Ethanolamine                                           20uM
  Ferric Citrate                                         0.122
  Methyl-B-Cyclodextrin complexed with Linoleic Acid     41.70
  Methyl-B-Cyclodextrin complexed with Oleic Acid        33.33
  Methyl-B-Cyclodextrin complexed with Retinal Acetate   10

Adjust osmolarity to 327 mOsm

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs
pathway bind to gamma activation site ("GAS") elements or interferon-sensitive
responsive element ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types though it has been found in T helper class 1 cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire process
is encompassed in the Jaks-STATs signal transduction pathway.

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of
the GAS or the ISRE element, can be used to indicate proteins involved in the
proliferation and differentiation of cells. For example, growth factors and cytokines are
known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using GAS
elements linked to reporter molecules, activators of the Jaks-STATs pathway can be
identified.



                                    JAKs
Ligand                      tyk2    Jak1    Jak2    Jak3    STATS     GAS (elements) or ISRE

IFN family
IFN-a/b                     +       +                       1,2,3     ISRE
IFN-g                                       +               1         GAS (IRF1>Lys6>IFP)
Il-10                       +               ?       -       1,3

gp130 family
IL-6 (Pleiotropic)          +       +       +       ?       1,3       GAS (IRF1>Lys6>IFP)
Il-11 (Pleiotropic)         ?       +       ?       ?       1,3
OnM (Pleiotropic)           ?       +       +       ?       1,3
LIF (Pleiotropic)           ?       +       +       ?       1,3
CNTF (Pleiotropic)          -/+     +       +       ?       1,3
G-CSF (Pleiotropic)         ?       +       ?       ?       1,3
IL-12 (Pleiotropic)         +               +       +       1,3

g-C family
IL-2 (lymphocytes)                  +               +       1,3,5     GAS
IL-4 (lymph/myeloid)        -               -       +       6         GAS (IRF1 = IFP >>Ly6)(IgH)
IL-7 (lymphocytes)                  +       -       +       5         GAS
IL-9 (lymphocytes)                  +       -       +       5         GAS
IL-13 (lymphocyte)                  +       ?       ?       6         GAS
IL-15                       ?       +       ?       +       5         GAS

gp140 family
IL-3 (myeloid)                      -       +       -       5         GAS (IRF1>IFP>>Ly6)
IL-5 (myeloid)                              +               5         GAS
GM-CSF (myeloid)            -               +               5         GAS

Growth hormone family
GH                          ?               +               5
PRL                         ?       +/-     +       -       1,3,5
EPO                         ?       -       +       -       5         GAS (B-CAS>IRF1=IFP>>Ly6)

Receptor Tyrosine Kinases
EGF                         ?       +       +               1,3       GAS (IRF1)
PDGF                        ?       +       +               1,3
CSF-1                       ?       +       +               1,3       GAS (not IRF1)



To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994)), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:
5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID
NO:4)
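
The layout of the 5' primer (XhoI site, four GAS copies, 18 bp SV40 tail) can be verified directly on the sequence. The sketch below is a check on the printed primer, not part of the cloning protocol; it assumes the primer reads as given in SEQ ID NO:3 with spacing removed and assumes the general GAS consensus TTCNNNGAA, and the variable names are hypothetical.

```python
import re

# SEQ ID NO:3 as printed, with whitespace removed.
FIVE_PRIME_PRIMER = (
    "GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG"
    "AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG"
)

XHOI_SITE = "CTCGAG"
# Assumption: GAS elements follow the consensus TTCNNNGAA.
GAS_PATTERN = re.compile(r"TTC[ACGT]{3}GAA")

has_xhoi = XHOI_SITE in FIVE_PRIME_PRIMER
gas_sites = GAS_PATTERN.findall(FIVE_PRIME_PRIMER)   # non-overlapping matches
sv40_tail = FIVE_PRIME_PRIMER[-18:]                   # 3' end that anneals to SV40
```

Here `gas_sites` holds four identical matches (`TTCCCCGAA`), `has_xhoi` is true, and `sv40_tail` is the 18-base stretch `ATCTGCCATCTCAATTAG` that follows the last GAS copy.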

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCGAAATG
ATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGC
CCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGC
CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyl transferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40 promoter
element, to create the GAS-SEAP vector. However, this vector does not contain a
neomycin resistance gene, and therefore, is not preferred for mammalian expression
systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS-
SEAP reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-KB and EGR promoter sequences is described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
Il-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity.

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may cause T-cells to proliferate or
differentiate. T-cell activity is assessed using the GAS/SEAP/Neo construct produced
in Example 12. Thus, factors that increase SEAP activity indicate the ability to activate
the Jaks-STATS signal transduction pathway. The T-cells used in this assay are Jurkat
T-cells (ATCC Accession No. TIB-152), although Molt-3 cells (ATCC Accession No.
CRL-1552) and Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the GAS-
SEAP/neo vector using DMRIE-C (Life Technologies)(transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml Geneticin are selected.
Resistant colonies are expanded and then tested for their response to increasing
concentrations of interferon gamma. The dose response of a selected clone is demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiples to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 mls of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^7 per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^7 cells/ml. Then add 1ml of 1 x 10^7 cells in OPTI-MEM to T25 flask
and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI + 15% serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Geneticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
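
The cell numbers quoted for plating follow from the stated density and volume. The short calculation below is a sketch with hypothetical names; it counts only the cells actually dispensed and makes no allowance for dead volume in the reservoir.

```python
CELLS_PER_ML = 500_000     # density after resuspension
UL_PER_WELL = 200          # volume dispensed per well
WELLS_PER_PLATE = 96

# 200 ul is one fifth of a millilitre, so each well receives 100,000 cells.
cells_per_well = CELLS_PER_ML * UL_PER_WELL // 1000


def cells_dispensed(plates: int) -> int:
    """Total cells dispensed into the given number of full 96-well plates."""
    return plates * WELLS_PER_PLATE * cells_per_well
```

One plate takes 9.6 million cells, which is where the figure of approximately 10 million cells per plate comes from; ten plates take 96 million.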

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of material
for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma can be used which is
known to activate Jurkat T cells. Over 30 fold induction is typically observed in the
positive control wells.
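
Fold induction is the ratio of the reporter signal in a treated well to the signal in an untreated or vector-only well. The following sketch shows one way to compute it and to check a positive control against the typical 30 fold figure; the function names, the use of 30 as a hard cutoff, and the readings in the usage note are illustrative assumptions, not values from the assay.

```python
def fold_induction(sample_signal: float, baseline_signal: float) -> float:
    """Ratio of treated-well signal to baseline signal."""
    if baseline_signal <= 0:
        raise ValueError("baseline signal must be positive")
    return sample_signal / baseline_signal


def positive_control_ok(sample_signal: float, baseline_signal: float,
                        minimum_fold: float = 30.0) -> bool:
    """True if the positive control exceeds the expected fold induction."""
    return fold_induction(sample_signal, baseline_signal) > minimum_fold
```

For hypothetical readings of 3,200 units in an interferon gamma well and 100 units at baseline, the fold induction is 32 and the control passes; a reading of 1,500 units would not.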

Example 14: High-Throughput Screening Assay Identifying Myeloid
Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may cause myeloid cells to proliferate or
differentiate. Myeloid cell activity is assessed using the GAS/SEAP/Neo construct
produced in Example 12. Thus, factors that increase SEAP activity indicate the ability to
activate the Jaks-STATS signal transduction pathway. The myeloid cell used in this assay
is U937, a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2x10^7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10% heat-
inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. The G418-free medium is used for routine growth but every one to two
months, the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1x10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the above described
growth medium, with a final density of 5x10^5 cells/ml. Plate 200 ul cells per well in the
96-well plate (or 1x10^5 cells/well).

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma
can be used which is known to activate U937 cells. Over 30 fold induction is typically
observed in the positive control wells. SEAP assay the supernatant according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal
Activity.

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1)(Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:

5' GCGCTCGAGGGATGACAGCGATAGAACCCCGG-3' (SEQ ID NO:6)
5' GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)
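
Because the amplified promoter is later cut with XhoI and HindIII, it can be useful to confirm which recognition sites each primer carries. The sketch below scans the two primers as printed for three common sites; the dictionary and function names are hypothetical, and only the forward strand of each primer is examined.

```python
# Recognition sequences of the enzymes of interest (all palindromic).
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

PRIMERS = {
    "SEQ ID NO:6": "GCGCTCGAGGGATGACAGCGATAGAACCCCGG",
    "SEQ ID NO:7": "GCGAAGCTTCGCGACTCCCCGGATCCGCCTC",
}


def sites_in(sequence: str) -> list:
    """Names of the enzymes whose recognition site occurs in the sequence."""
    return sorted(name for name, site in SITES.items() if site in sequence)


site_report = {name: sites_in(seq) for name, seq in PRIMERS.items()}
```

As printed, the first primer carries the XhoI site and the second carries the HindIII site; the second primer also contains a GGATCC (BamHI) stretch, which matters only if BamHI were used in a later cloning step.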

Using the GAS:SEAP/Neo vector produced in Example 12, EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.

To prepare 96 well-plates for cell culture, two mls of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ul per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (BioWhittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P) and 5%
heat-inactivated fetal bovine serum (FBS), supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin, on a precoated 10 cm tissue culture dish. A one-to-four split is done
every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 cells using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. G418-free medium is used for routine
growth, but every one to two months the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80% 
confluent is screened by removing the old medium. Wash the cells once with PBS 
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640 
containing 1% horse serum and 0.5% FBS with antibiotics) overnight. 

The next morning, remove the medium and wash the cells with PBS. Scrape
the cells off the plate and suspend them well in 2 ml low serum medium. Count
the cells and add more low serum medium to reach a final cell density of 5x10^5
cells/ml.

Add 200 ul of the cell suspension to each well of a 96-well plate (equivalent to
1x10^5 cells/well). Add 50 ul of the supernatant produced in Example 11 and incubate
at 37°C for 48 to 72 hr. As a positive control, a growth factor known to activate PC12
cells through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF). Over
fifty-fold induction of SEAP is typically seen in the positive control wells. Assay the
supernatant for SEAP according to Example 17.
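
The plating arithmetic above can be sketched in a few lines of Python. This is an illustrative calculation only; the function names and the example cell count of 4x10^6 are assumptions, not part of the protocol.

```python
def dilution_volume_ml(total_cells, target_density_per_ml):
    """Total suspension volume (ml) that gives the target cell density."""
    return total_cells / target_density_per_ml


def cells_per_well(density_per_ml, volume_ul):
    """Number of cells delivered when volume_ul of suspension is dispensed."""
    return density_per_ml * volume_ul / 1000.0


# Hypothetical count: 4x10^6 cells recovered in the initial 2 ml suspension.
counted_cells = 4.0e6
final_volume_ml = dilution_volume_ml(counted_cells, 5e5)
medium_to_add_ml = final_volume_ml - 2.0  # low serum medium still to add

print(f"bring suspension to {final_volume_ml} ml (add {medium_to_add_ml} ml)")
print(f"cells per well at 200 ul: {cells_per_well(5e5, 200):.0f}")
```

Dispensing 200 ul of a 5x10^5 cells/ml suspension gives the 1x10^5 cells/well stated in the protocol.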

Example 16: High-Throughput Screening Assay for T-cell Activity

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis
(NF-kB appears to shield cells from apoptosis), B and T-cell development, anti-viral and
antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:

5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a HindIII site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)
PCR amplification is performed using the SV40 promoter template present in
the pB-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and HindIII and subcloned into BLSK2- (Stratagene).
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCT:3' (SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the pSEAP2-promoter
plasmid (Clontech) with this NF-kB/SV40 fragment using XhoI and HindIII.
However, this vector does not contain a neomycin resistance gene, and therefore is not
preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-kB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the
NF-kB/SV40/SEAP cassette is inserted into pGFP-1 (Clontech), replacing the GFP
gene, after restricting pGFP-1 with SalI and NotI.

Once the NF-kB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with a
plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid uneven
heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature for 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on the luminometer, one should treat 5 plates at a
time and start the second set 10 minutes later.

Read the relative light units in the luminometer. Set H12 as blank, and print the
results. An increase in chemiluminescence indicates reporter activity.

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)
10             60                         3
11             65                         3.25
12             70                         3.5
13             75                         3.75
14             80                         4
15             85                         4.25
16             90                         4.5
17             95                         4.75
18             100                        5
19             105                        5.25
20             110                        5.5
21             115                        5.75
22             120                        6
23             125                        6.25
24             130                        6.5
25             135                        6.75
26             140                        7
27             145                        7.25
28             150                        7.5
29             155                        7.75
30             160                        8
31             165                        8.25
32             170                        8.5
33             175                        8.75
34             180                        9
35             185                        9.25
36             190                        9.5
37             195                        9.75
38             200                        10
39             205                        10.25
40             210                        10.5
41             215                        10.75
42             220                        11
43             225                        11.25
44             230                        11.5
45             235                        11.75
46             240                        12
47             245                        12.25
48             250                        12.5
49             255                        12.75
50             260                        13
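
The legible rows of the Reaction Buffer table follow a simple linear pattern: 60 ml of diluent for 10 plates plus 5 ml for each additional plate, with CSPD at one twentieth of the diluent volume. The sketch below reproduces the table on that assumption. The function name is hypothetical, and the formula is inferred from the table, not taken from the kit documentation.

```python
def reaction_buffer_volumes(plates):
    """Return (diluent_ml, cspd_ml) for a plate count covered by the table.

    Assumes the linear pattern read off the table: 60 ml diluent at
    10 plates, +5 ml per extra plate, CSPD = diluent / 20.
    """
    if not 10 <= plates <= 50:
        raise ValueError("table only covers 10 to 50 plates")
    diluent_ml = 60 + 5 * (plates - 10)
    cspd_ml = diluent_ml / 20
    return diluent_ml, cspd_ml


for n in (10, 27, 50):
    diluent, cspd = reaction_buffer_volumes(n)
    print(f"{n} plates: {diluent} ml diluent, {cspd} ml CSPD")
```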






Example 18: High-Throughput Screening Assay Identifying Changes in 
Small Molecule Concentration and Membrane Permeability 

Binding of a ligand to a receptor is known to alter intracellular levels of small
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an
assay for calcium, this protocol can easily be modified to detect changes in potassium,
sodium, pH, membrane potential, or any other small molecule which is detectable by a
fluorescent probe.

The following assay uses a Fluorometric Imaging Plate Reader ("FLIPR") to
measure changes in fluorescent molecules (Molecular Probes) that bind small
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used
instead of the calcium fluorescent molecule, fluo-3, used here.

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Costar black
96-well plate with clear bottom. The plate is incubated in a CO2 incubator for 20 hours.
The adherent cells are washed two times in a Biotek washer with 200 ul of HBSS
(Hank's Balanced Salt Solution), leaving 100 ul of buffer after the final wash.

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the
Biotek washer with HBSS, leaving 100 ul of buffer.

For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5x10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice
with HBSS, resuspended to 1x10^6 cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once
in Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final volume.
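
As a rough check on the two loading procedures, the final fluo-3 concentration during loading can be estimated with a simple dilution formula. The sketch assumes the adherent wells still hold the 100 ul of buffer left by the wash when the dye is added; the function name is illustrative.

```python
def final_concentration(stock_conc, stock_volume, existing_volume):
    """Concentration after adding stock_volume of stock to existing_volume.

    Concentration keeps the units of stock_conc; both volumes must share
    the same unit.
    """
    return stock_conc * stock_volume / (stock_volume + existing_volume)


# Adherent cells: 50 ul of 12 ug/ml fluo-3 added to 100 ul of buffer.
adherent = final_concentration(12.0, 50.0, 100.0)

# Non-adherent cells: 4 ul of 1 mg/ml (1000 ug/ml) stock per 1000 ul of cells.
suspension = final_concentration(1000.0, 4.0, 1000.0)

print(f"adherent loading: {adherent:.2f} ug/ml")
print(f"suspension loading: {suspension:.2f} ug/ml")
```

Under these assumptions both procedures load the cells at roughly 4 ug/ml fluo-3.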

For a non-cell based assay, each well contains a fluorescent molecule, such as
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected.

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular
signaling event which has resulted in an increase in the intracellular Ca++
concentration.

Example 19: High-Throughput Screening Assay Identifying Tyrosine
Kinase Activity

The Protein Tyrosine Kinases (PTK) represent a diverse group of
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In
addition, there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also
membrane-bound and extracellular matrix proteins.

Activation of RPTK by ligands involves ligand-mediated receptor dimerization,
resulting in transphosphorylation of the receptor subunits and activation of the
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family, members
of which mediate signal transduction triggered by the cytokine superfamily of receptors
(e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of activating
tyrosine kinase signal transduction pathways is of interest. Therefore, the following
protocol is designed to identify those novel human secreted proteins capable of
activating the tyrosine kinase signal transduction pathways.

Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96-well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 mg/ml), gelatin (2%) or polylysine
(50 mg/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000
cells/well in growth medium and indirectly quantitating cell number after 48 hr with
alamarBlue, as described by the manufacturer, Alamar Biosciences, Inc. (Sacramento,
CA). Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture
plates can also be used in some proliferation experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example
11, the medium is removed and 100 ul of extraction buffer (20 mM HEPES pH
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7,
and a cocktail of protease inhibitors (# 1836170) obtained from Boehringer Mannheim
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker for
5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the extract
filtered through the 0.45 um membrane bottoms of each well using house vacuum.
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation,
the content of each well, after detergent solubilization for 5 minutes, is removed and
centrifuged for 15 minutes at 4°C at 16,000 x g.

Test the filtered extracts for levels of tyrosine kinase activity. Although many
methods of detecting tyrosine kinase activity are known, one method is described here.

Generally, the tyrosine kinase activity of a supernatant is evaluated by
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim.

The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM
EDTA and placing the reactions on ice.
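
The component volumes listed above add up to a 50 ul reaction, which dilutes the 5x Assay Buffer to 1x. The following sketch tallies the volumes and the resulting final concentrations. The data structure and names are illustrative, and the stock values are simply those quoted in the protocol.

```python
# (component, volume in ul, stock concentration, unit); None = no single stock value
COMPONENTS = [
    ("biotinylated peptide", 10.0, 5.0, "uM"),
    ("ATP (in ATP/Mg2+ mix)", 10.0, 5.0, "mM"),
    ("5x assay buffer", 10.0, 5.0, "x"),
    ("sodium vanadate", 5.0, 1.0, "mM"),
    ("water", 5.0, None, ""),
    ("control enzyme or filtered supernatant", 10.0, None, ""),
]


def diluted(stock_conc, volume_ul, total_ul):
    """Final concentration of a component after dilution into total_ul."""
    return stock_conc * volume_ul / total_ul


reaction_volume_ul = sum(volume for _, volume, _, _ in COMPONENTS)
print(f"reaction volume: {reaction_volume_ul} ul")

for name, volume, stock, unit in COMPONENTS:
    if stock is not None:
        print(f"{name}: {diluted(stock, volume, reaction_volume_ul):g} {unit}")

# Stopping the reaction: 10 ul of 120 mM EDTA added to the 50 ul reaction.
edta_final_mM = diluted(120.0, 10.0, reaction_volume_ul + 10.0)
print(f"EDTA after stop: {edta_final_mM:g} mM")
```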

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the biotinylated peptide to associate with the streptavidin-coated 96-well plate.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the well as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 min (up to 30 min). Measure the
absorbance of the sample at 405 nm using an ELISA reader. The level of bound
peroxidase activity reflects the level of tyrosine kinase activity.

Example 20: High-Throughput Screening Assay Identifying 
Phosphorylation Activity 

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below, one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temperature (RT). The plates are then
rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this
step can easily be modified by substituting a monoclonal antibody detecting any of the
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C
until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The
bound polyclonal antibody is then quantitated by successive incubations with
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over
background indicates phosphorylation.

Example 21: Method of Determining Alterations in a Gene 
Corresponding to a Polynucleotide 

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is
then used as a template for PCR, employing primers surrounding regions of interest in
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer
solutions described in Sidransky, D., et al., Science 252:706 (1991).
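
For planning purposes, the suggested conditions imply a total hold time that can be bounded with simple arithmetic. The sketch below sums only the programmed step durations; ramp times between temperatures and any initial or final holds are not specified in the text and are ignored here. The function name is illustrative.

```python
def cycling_time_minutes(cycles, denature_s, anneal_s, extend_s):
    """Total programmed hold time in minutes, excluding ramping."""
    return cycles * (denature_s + anneal_s + extend_s) / 60.0


# 35 cycles: 30 s at 95°C, 60-120 s at 52-58°C, 60-120 s at 70°C
shortest = cycling_time_minutes(35, 30, 60, 60)
longest = cycling_time_minutes(35, 30, 120, 120)

print(f"programmed cycling time: {shortest} to {longest} minutes")
```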

PCR products are then sequenced using primers labeled at their 5' end with T4
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies).
The intron-exon borders of selected exons are also determined and genomic PCR
products analyzed to confirm the results. PCR products harboring suspected mutations
are then cloned and sequenced to validate the results of the direct sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T. A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxyuridine
5'-triphosphate (Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is
carried out using a vast excess of human cot-1 DNA for specific hybridization to the
corresponding genomic locus.

Chromosomes are counterstained with 4,6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System (Inovision Corporation, Durham, NC). Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample,
and if an increased or decreased level of the polypeptide is detected, this polypeptide is
a marker for a particular phenotype. Methods of detection are numerous, and thus, it is
understood that one skilled in the art can modify the following assay to fit their
particular needs.

For example, antibody-sandwich ELISAs are used to detect soluble
polypeptides in a sample, preferably a biological sample. Wells of a microtiter plate are
coated with specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The
antibodies are either monoclonal or polyclonal and are produced by the method
described in Example 10. The wells are blocked so that non-specific binding of the
polypeptide to the well is reduced.

The coated wells are then incubated for > 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used to
validate results. The plates are then washed three times with deionized or distilled water
to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction with a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration on
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard curve.
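
One simple way to carry out the interpolation is piecewise-linear interpolation between neighboring standards on the log-concentration axis. The sketch below illustrates that approach; it is not the only valid curve fit. The standard values are hypothetical, and the signal is assumed to rise with concentration.

```python
import math


def interpolate_concentration(standards, signal):
    """Estimate concentration from a signal using a standard curve.

    standards: (concentration, signal) pairs with signal increasing with
    concentration. Interpolation is linear in signal versus log10 of
    concentration, matching a log X-axis and a linear Y-axis.
    """
    points = sorted(standards)
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            fraction = (signal - s0) / (s1 - s0)
            log_c = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/ml, absorbance)
standards = [(1.0, 0.1), (10.0, 0.5), (100.0, 0.9)]
estimate = interpolate_concentration(standards, 0.3)
print(f"estimated concentration: {estimate:.2f} ng/ml")
```

Samples whose signal falls outside the standards raise an error instead of being extrapolated, which is one reason to run serial dilutions of the sample.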

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a fashion
consistent with good medical practice, taking into account the clinical condition of the
individual patient (especially the side effects of treatment with the secreted polypeptide
alone), the site of delivery, the method of administration, the scheduling of
administration, and other factors known to practitioners. The "effective amount" for
purposes herein is thus determined by such considerations.

As a general proposition, the total pharmaceutically effective amount of secreted
polypeptide administered parenterally per dose will be in the range of about 1 ug/kg/day
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If
given continuously, the secreted polypeptide is typically administered at a dose rate of
about 1 ug/kg/hour to about 50 ug/kg/hour, either by 1-4 injections per day or by
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous
bag solution may also be employed. The length of treatment needed to observe changes
and the interval following treatment for responses to occur appear to vary depending
on the desired effect.
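
The ranges above are expressed per kilogram of body weight, so the absolute amounts scale linearly with patient weight. The sketch below only illustrates that unit arithmetic for a hypothetical 70 kg patient; the function names and the weight are assumptions, and the output is not dosing guidance.

```python
UG_PER_MG = 1000.0


def daily_dose_range_ug(weight_kg, low_ug_per_kg=1.0, high_mg_per_kg=10.0):
    """Absolute daily range in ug for the per-dose range of 1 ug/kg to 10 mg/kg."""
    return low_ug_per_kg * weight_kg, high_mg_per_kg * UG_PER_MG * weight_kg


def infusion_amount_ug(weight_kg, rate_ug_per_kg_hr, hours=24.0):
    """Amount delivered by a continuous infusion at the given rate."""
    return rate_ug_per_kg_hr * weight_kg * hours


weight = 70.0  # hypothetical patient weight in kg
low, high = daily_dose_range_ug(weight)
print(f"per-day range: {low} ug to {high / UG_PER_MG} mg")
print(f"24 h infusion at 1 ug/kg/hour: {infusion_amount_ug(weight, 1.0)} ug")
print(f"24 h infusion at 50 ug/kg/hour: {infusion_amount_ug(weight, 50.0)} ug")
```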

Pharmaceutical compositions containing the secreted protein of the invention are
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), buccally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal,
subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-permeable
polymer matrices in the form of shaped articles, e.g., films, or microcapsules.
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481),
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al.,
Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-
105 (1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692
(1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008;
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted
for the optimal secreted polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations
employed and is compatible with other ingredients of the formulation. For example, the
formulation preferably does not include oxidizing agents and other compounds that are
known to be deleterious to polypeptides.

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances that
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at
the dosages and concentrations employed, and include buffers such as phosphate,
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g.,
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids,
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates,
poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of
about 3 to 8. It will be understood that the use of certain of the foregoing excipients,
carriers, or stabilizers will result in the formation of polypeptide salts.

Any polypeptide to be used for therapeutic administration can be sterile.
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g.,
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed
into a container having a sterile access port, for example, an intravenous solution bag or
vial having a stopper pierceable by a hypodermic injection needle.

Polypeptides ordinarily will be stored in unit or multi-dose containers, for
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the
lyophilized polypeptide using bacteriostatic Water-for-Injection.

The invention also provides a pharmaceutical pack or kit comprising one or
more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in the
form prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In addition, the polypeptides of the
present invention may be employed in conjunction with other therapeutic compounds.

Example 24: Method of Treating Decreased Levels of the Polypeptide 

It will be appreciated that conditions caused by a decrease in the standard or 
normal expression level of a secreted protein in an individual can be treated by 
administering the polypeptide of the present invention, preferably in the secreted form. 
Thus, the invention also provides a method of treatment of an individual in need of an 
increased level of the polypeptide comprising administering to such an individual a 
pharmaceutical composition comprising an amount of the polypeptide to increase the 
activity level of the polypeptide in such an individual. 

For example, a patient with decreased levels of a polypeptide receives a daily 
dose of 0.1-100 µg/kg of the polypeptide for six consecutive days. Preferably, the 
polypeptide is in the secreted form. The exact details of the dosing scheme, based on 
administration and formulation, are provided in Example 23. 
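
The dosing arithmetic in this example can be sketched as follows. This is an illustration only: the function names and the 70 kg body weight and 10 µg/kg dose used below are assumptions, while the dose range and the six-day course come from the text.

```python
DOSE_RANGE_UG_PER_KG = (0.1, 100.0)  # daily dose range stated in the example
TREATMENT_DAYS = 6                   # six consecutive days


def daily_dose_ug(weight_kg: float, dose_ug_per_kg: float) -> float:
    """Daily dose in micrograms for a given body weight and per-kg dose."""
    low, high = DOSE_RANGE_UG_PER_KG
    if not low <= dose_ug_per_kg <= high:
        raise ValueError("dose is outside the 0.1-100 ug/kg range of the example")
    return weight_kg * dose_ug_per_kg


def course_total_ug(weight_kg: float, dose_ug_per_kg: float) -> float:
    """Cumulative dose in micrograms over the six-day course."""
    return daily_dose_ug(weight_kg, dose_ug_per_kg) * TREATMENT_DAYS


# Hypothetical patient: 70 kg, dosed at 10 ug/kg per day.
print(daily_dose_ug(70.0, 10.0), course_total_ug(70.0, 10.0))
```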

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the 
present invention. This technology is one example of a method of decreasing levels of 
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer. 

For example, a patient diagnosed with abnormally increased levels of a 
polypeptide is administered antisense polynucleotides intravenously at 0.5, 1.0, 1.5, 
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period 
if the treatment was well tolerated. The formulation of the antisense polynucleotide is 
provided in Example 23. 
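
The treatment cycle described above (21 days of dosing followed by a 7-day rest period) can be laid out as a simple schedule. The sketch below assumes that the five listed values are alternative dose levels rather than a within-patient escalation, which the text does not state; the function names and the 70 kg body weight are likewise hypothetical.

```python
DOSE_LEVELS_MG_PER_KG_DAY = (0.5, 1.0, 1.5, 2.0, 3.0)  # from the example
DOSING_DAYS = 21  # days of intravenous administration per cycle
REST_DAYS = 7     # rest period before the treatment is repeated


def dosing_days(n_cycles: int) -> list:
    """Return the 1-based study days on which drug is given for n_cycles cycles."""
    days = []
    cycle_length = DOSING_DAYS + REST_DAYS
    for cycle in range(n_cycles):
        start = cycle * cycle_length + 1
        days.extend(range(start, start + DOSING_DAYS))
    return days


def cycle_total_mg(weight_kg: float, dose_mg_per_kg_day: float) -> float:
    """Cumulative antisense dose in mg over one 21-day cycle."""
    return weight_kg * dose_mg_per_kg_day * DOSING_DAYS


schedule = dosing_days(2)  # an initial cycle plus one repeat
for level in DOSE_LEVELS_MG_PER_KG_DAY:
    print(level, cycle_total_mg(70.0, level))  # hypothetical 70 kg patient
```

Under these assumptions the second cycle begins on study day 29, after the rest period covering days 22 to 28.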

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of 
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a 
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and 
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a 
tissue culture flask; approximately ten pieces are placed in each flask. The flask is 
turned upside down, closed tight and left at room temperature overnight. After 24 
hours at room temperature, the flask is inverted, the chunks of tissue remain fixed to 
the bottom of the flask, and fresh media (e.g., Ham's F12 media with 10% FBS, 
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for 
approximately one week. 

At this time, fresh media is added and subsequently changed every several days. 
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The 
monolayer is trypsinized and scaled into larger flasks. 

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long 
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and 
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is 
fractionated on agarose gel and purified, using glass beads. 

The cDNA encoding a polypeptide of the present invention can be amplified 
using PCR primers which correspond to the 5' and 3' end sequences respectively as set 
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer 
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear 
backbone and the amplified EcoRI and HindIII fragment are added together, in the 
presence of T4 DNA ligase. The resulting mixture is maintained under conditions 
appropriate for ligation of the two fragments. The ligation mixture is then used to 
transform bacteria HB101, which are then plated onto agar containing kanamycin for 
the purpose of confirming that the vector has the gene of interest properly inserted. 
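
The primer design described above, in which an EcoRI site is carried on the 5' primer and a HindIII site on the 3' primer, can be sketched in a few lines. Everything specific in this sketch is an assumption made for illustration: the insert sequence is invented, the three-base "GCG" clamp and the 12-base annealing length are arbitrary choices, and the function names are hypothetical. Only the two recognition sequences (GAATTC for EcoRI, AAGCTT for HindIII) are standard.

```python
ECORI = "GAATTC"    # EcoRI recognition sequence
HINDIII = "AAGCTT"  # HindIII recognition sequence

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def tailed_primer(site: str, annealing: str, clamp: str = "GCG") -> str:
    """Build a primer: 5' clamp + restriction site + annealing region."""
    return clamp + site + annealing.upper()


def internal_sites(insert: str, sites) -> list:
    """List the restriction sites that also occur inside the insert.

    Both sites used here are palindromic, so scanning one strand is enough.
    """
    return [site for site in sites if site in insert.upper()]


# Invented coding sequence, used only to exercise the functions.
insert = "ATGGCCTCAGGATCCGTTACCTGA"

forward = tailed_primer(ECORI, insert[:12])
reverse = tailed_primer(HINDIII, reverse_complement(insert[-12:]))
print(forward, reverse, internal_sites(insert, [ECORI, HINDIII]))
```

Checking the insert for internal EcoRI or HindIII sites before amplification matters because digestion at an internal site would cut the fragment that is meant to be ligated into the vector.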

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue 
culture to confluent density in Dulbecco's Modified Eagle's Medium (DMEM) with 10% 
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is 
then added to the media and the packaging cells transduced with the vector. The 
packaging cells now produce infectious viral particles containing the gene (the 
packaging cells are now referred to as producer cells). 

Fresh media is added to the transduced producer cells, and subsequently, the 
media is harvested from a 10 cm plate of confluent producer cells. The spent media, 
containing the infectious viral particles, is filtered through a Millipore filter to remove 
detached producer cells and this media is then used to infect fibroblast cells. Media is 
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media 
from the producer cells. This media is removed and replaced with fresh media. If the 
titer of virus is high, then virtually all fibroblasts will be infected and no selection is 
required. If the titer is very low, then it is necessary to use a retroviral vector that has a 
selectable marker, such as neo or his. Once the fibroblasts have been efficiently 
infected, the fibroblasts are analyzed to determine whether protein is being produced. 

The engineered fibroblasts are then transplanted onto the host, either alone or 
after having been grown to confluence on Cytodex 3 microcarrier beads. 

It will be clear that the invention may be practiced otherwise than as particularly 
described in the foregoing description and examples. Numerous modifications and 
variations of the present invention are possible in light of the above teachings and, 
therefore, are within the scope of the appended claims. 

The entire disclosure of each document cited (including patents, patent 
applications, journal articles, abstracts, laboratory manuals, books, or other 
disclosures) in the Background of the Invention, Detailed Description, and Examples is 
hereby incorporated herein by reference. 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GGGATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGTGC CCAGCACCTG 60 

AATTCGAGGG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACCCTCATGA 120 

TCTCCCGGAC TCCTGAGGTC ACATGCGTGG TGGTGGACGT AAGCCACGAA GACCCTGAGG 180 

TCAAGTTCAA CTGGTACGTG GACGGCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 240 

AGGAGCAGTA CAACAGCACG TACCGTGTGG TCAGCGTCCT CACCGTCCTG CACCAGGACT 300 

GGCTGAATGG CAAGGAGTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCATCG 360 

AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTGTAC ACCCTGCCCC 420 

CATCCCGGGA TGAGCTGACC AAGAACCAGG TCAGCCTGAC CTGCCTGGTC AAAGGCTTCT 480 

ATCCAAGCGA CATCGCCGTG GAGTGGGAGA GCAATGGGCA GCCGGAGAAC AACTACAAGA 540 

CCACGCCTCC CGTGCTGGAC TCCGACGGCT CCTTCTTCCT CTACAGCAAG CTCACCGTGG 600 

ACAAGAGCAG GTGGCAGCAG GGGAACGTCT TCTCATGCTC CGTGATGCAT GAGGCTCTGC 660 

ACAACCACTA CACGCAGAAG AGCCTCTCCC TGTCTCCGGG TAAATGAGTG CGACGGCCGC 720 

GACTCTAGAG GAT 733 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Trp Ser Xaa Trp Ser 
1 5 



(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 60 
CCCGAAATAT CTGCCATCTC AATTAG 86 
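
Each entry in this listing states a LENGTH and prints the sequence in blocks of ten bases with a running count at the end of each line, so an entry can be checked for internal consistency. The following is a minimal sketch of such a check, using SEQ ID NO: 3 as the test case; the function names are hypothetical, and the parser assumes that any all-digit token at the end of a line is the running count.

```python
def parse_listing_line(line: str):
    """Split one listing line into (bases, running_count or None).

    Tokens made only of letters are treated as sequence blocks; the last
    all-digit token, if any, is treated as the running base count. Tokens
    mixing letters and digits (e.g. scanning errors) are ignored.
    """
    tokens = line.split()
    bases = "".join(t for t in tokens if t.isalpha())
    counts = [int(t) for t in tokens if t.isdigit()]
    return bases, (counts[-1] if counts else None)


def check_listing(lines, declared_length: int) -> bool:
    """True if every running count and the declared LENGTH match the bases."""
    total = 0
    for line in lines:
        bases, running = parse_listing_line(line)
        total += len(bases)
        if running is not None and running != total:
            return False
    return total == declared_length


SEQ_ID_NO_3 = [
    "GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 60",
    "CCCGAAATAT CTGCCATCTC AATTAG 86",
]

print(check_listing(SEQ_ID_NO_3, declared_length=86))
```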

(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GCGGCAAGCT TTTTGCAAAG CCTAGGC 27 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



CTCGAGATTT CCCCGAAATC TAGATTTCCC CGAAATGATT TCCCCGAAAT GATTTCCCCG 60 

AAATATCTGC CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC 120 

GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT 180 

TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT 240 

TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T 271 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GCGCTCGAGG GATGACAGCG ATAGAACCCC GG 32 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCGAAGCTTC GCGACTCCCC GGATCCGCCT C 31 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGGGACTTTC CC 12 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG 60 

CCATCTCAAT TAG 73 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 256 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT 60 

CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 120 

CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 180 

GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG 240 

CTTTTGCAAA AAGCTT 256 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GGCACGAGGT AATTTCTACC AGAAATTTCC AGAGCATTAT GTAGGTAGAA AAAAATGCAA 60 

GCAAGCTGTT AAAGATCTTG GATCCCATTA TATAGTATGT ATAGCTGAAA TCTGTAATTC 120 
AATCACTTTT TCTCTTTTAT CCTCTAACCA AAAAATTGTT TAATTTTGCA TCCCAAATGT 180 
TTTTAATCTT TGTATATTTT TTAAAAATCC TTTTCTCCTC ATCATTGCCT TTTTTGTGGT 240 
TGTAAATAGA CTTACTTGCA CTTTGAAGAT GAGTTACTCC TTGTCATCTT ACAAATATGT 300 
GATATGGTAA TTTTCATAAC AGATGTCAGT TTTGAACCAA GAATTGGTGA TTTGTTTATA 360 

AGAAAAAAAC TGGCTTCATT TCTGTGAAAT TGCTCTTTGA AAATTTCTTT 7TACACGTGT 420 
AAGCCAACTG AGATACCGTG ATGGTGTTGA TTTCTTTCAA TGATGCITAC CATCTATTTT 480 
AGCCACTGAG CCTTTTATTA TTTGTCTATT TGTAAAGTTT ATTTGTCTTA ACTCATTTAA 540 
TAAATATACT GTTTATCTGT TTCTGAAAAA AAAAAAAAAA AA 582 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 465 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GTTTGGGGGT GAGGCCGAGC TGCTGCGGGG CTTCGTCGCC GGCCAGGACA CAGCTACTCG 60 

CACGGCGGCG GCGCCTGGCT ATGATGTTCC TCACCCAGGG CGGGCCTCTG CCCTCTACTC 120 

GTGCCAGGCC CACTTGCCAG GCAGGAGCCC TCCCCAAGCC TTCAGGGCTG CTCGGAGTCA 180 

CCTGTTGGAA TGGACTAAAA GGACCCTTGT GTGGGAACAG GTGCTCCCCA AACACCCTGC 240 

TGCTGGCTGC CAGGCAGGCC CTCTGGAAGG GAAGGGGCAG GACTCATCAG GACCTCCCTG 300 

GACCCCTGCA GGGCAGGCAG CTTGGGCCCG AGCCCAAGCA TTTGGCTCTG CTGCCCCCAA 360 

GGGGACAGGA AGCCTCTTGG GCCTCTTCCC TTCCTGGACA AGGCCCCCTG CCTTTGCCTC 420 

ACATAAACTG TACAGTATTT TCATTAAAAG CCTCTTTCAT AAAAA 465 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCAATTCC TGCTCACAGC CTTTCTGTTG GTGCCACTTC TGGCTCTTTG TGATGTCCCC 60 

ATATCCCTAG GCTTCTCCCC CTCCTAGAAG GGCTTCTTGA TAGATTAGAA AATAAGAATG 120 

AGTGACATTT CCTATGTGCA TATAAGAAGG AGCCACAAGA CATGTCTTTT AAATAAAAGG 180 

ACAGTGTCCA TCCTTTTAGC TGCCGAATAG AACCTTGGTC TCATCCTCCT GGAGCTAGGC 240 

CTTTAAAACA GCTTCTGTGT TTCTCATTTG TCTCAGTGTT TTGCCAGGGT TTTATCGGAA 300 

AGATAATGTT CCGTTTAAAA TATTTCCTAA TGAGGCCGGG CGTGGTGGCT CACGCCTGTA 360 

ACCCTAGCAM TTGGGGGCTG AGCGGGTGGA TCACGAGGTC AGGAGATCGA GACCATCCTG 420 
GSTAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA AAAAAAAAAA AAAA 474 
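
Some positions in these sequences carry letters other than A, C, G and T, such as the M and S in the lines above. These are standard IUPAC ambiguity codes (for example, M is A or C, and S is C or G). The sketch below, offered as an illustration only, tabulates the codes, flags characters that are not valid nucleotide codes, and expands a short ambiguous block into the concrete sequences it allows. The function names are hypothetical, and the block `CCITCTTCCT` is used merely as an example of the kind of character substitution that scanning can introduce.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def invalid_positions(block: str) -> list:
    """Return (index, character) pairs that are not IUPAC nucleotide codes."""
    return [(i, ch) for i, ch in enumerate(block)
            if ch.upper() not in IUPAC_NUCLEOTIDES]


def expand(block: str) -> list:
    """List every unambiguous sequence consistent with a short IUPAC block."""
    options = [IUPAC_NUCLEOTIDES[ch] for ch in block.upper()]
    return ["".join(combo) for combo in product(*options)]


print(invalid_positions("ACCCTAGCAM"))  # M is a valid ambiguity code
print(invalid_positions("CCITCTTCCT"))  # I is not in the table
print(expand("GSTA"))
```

Expansion grows multiplicatively with the number of ambiguous positions, so it is practical only for short blocks.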

(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 314 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTATGTTGGG GAGCAAGACC TGATAGCCAG CCTTTACATG GGAGTATAAT TCTGTCCTCC 60 

ATCTCATAAG CCCCAGTACC TGAGCCAGAA TGATTATAAC CAACCACACT GTCTCTTTAT 120 

CATGGATGGC TTTAGCAGTA GGTTATTTTC ATCATTGCCA TTTGTAGCTC TACAGTGGTT 180 

TATAGTAATT TCTCATCTTT TAAGTCTCTC CCTCAGTGCC TGTTGTTATC AAACTCATTG 240 

CTCTCTCANG CAGTTGAGCT CTGCATTCTC CCYTATGGGG GAGAGCTGTG TTGGAGAGAG 300 
AGAATATNAC TTCC 314 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CTCATATTGC CGTCTGGCTA AAAGTGAACA TGCCATTGAT CAATCTGCTT TTATTATATT 60 

ATGTTCCTAA TGGTGGCAAG CAAGACAAGA AGTAGAAAGA AAGATGGTGT AAGCTCAAGA 120 

ACCCACTAAA TCTATCCTAT GGCCTGGGTT CACCCAGCCT GCTTTGTGGA TTTTGTCTCA 180 

CTATAACAGA GCTCCCAAGG AGACTGCAGA GTCAGCTCCC TTAAGCACTG TAACTAAAGC 240 

CTAACTCTTC CGTTCCACCC AACAATGTYC CCAGCTCATC CTCTTTCCCR AAGTCCCCTT 300 

TCTGCCCCAG ATGCGAATTG CATTTAACTA ATCCTCAAGT GAAATGTCCA CACAGRATTC 360 

CATTTTAATT AGCATACCAT AGTTTTTGTG CAAATTTGCT TTCAGARGAC TCCCATTGCA 420 

GCTGCTCAGA GACGCTAAWG GCAGGGCCTC TTGAWGCTTT CCCGATAGCT TTCAGCTGCA 480 

ATAGCTCTTA GGCAGAATGC CATGAGCGTC CTGCCCAACT GTATTACTGG GGAACACCTG 540 

ATTGGCTAGA AGTTGATCCT CCTGTAACTT TTCTGAGTTC TTTACATTTA CTCGTGAAAC 600 

CCAAATATGC CAC 613 

(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 356 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CCCCCCCCAT TGAACCCTGG GCTGTGAAAG TTTTTGCCTG TGTGGGTCGT TCTGTGTGGC 60 

GCCTGGTGTG TGGKTCCCAA CTCCTGTTGC AAAGTGGCAG CAGCCAATCA TGAAGCGCCC 120 

TTATTTTTAG TTGCAGATGA CCAGGTCTCC CCCCCACAGC CTCTGTCTGG TCCCTCATTG 180 

GTGAGTGGTC TGCCTGCCCA AGGAGCCTGA TTGGTGGGAA ATGGCATCAT CTAATATGAT 240 

GGGAAGGCAT TTGGTCCTGG TTATGTTTAT TACAACATCA TTGCACTCTG GGACTCCAGT 300 

CCCTGAAAAC GTAATTTGTG GTGTTACCAA AGGACCACAG GGGAAAAAAA AAAAAA 356 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GAAACTANAT CCCGGGGCTT TTAACNGGTA CTTGGGAAAT AAGTATTGGG TAATCACTAA 60 

GNGGACATTG ACTGCACCAA ACCAAAGCTA TAGAAAGAAA TGATTGACTT TTTAAAATAT 120 

ATTCACATTA ACTGTCCTAG GATACTTCTC TTGAGGCTTT GGAAAACTTC TTCCTTGAAA 180 

TTTGCATATC CACTCCAGTT CTGTCACCAA AGATTTTAAT CTTCAGATCG CAATTTCCIC 240 

TCTCCCAGAA AAAAGTACTA CAACAGGCTC AAGGGATATG CTTTGGTGGT CAAGGGATTA 300 

CACTATGGTT TTCCTTCTGT TCACAATGGT ATTTACAGGA GACCTTGTCA TCAGAGGACG 360 

TACTGAACTA TC7TTTATGAC TTTGGATTTG ATCAGAGGTT TAAAAAAAAA AAAA 414 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

AATCACCATT GCAATACAAA TGATCTGCCT GGTGAATGYT GAGCTGTACC CCACATTCGT 60 

CAGGAACYTC GGAGTGATGG TGTGTTCCTC CCTGTGTGAC ATAGGTGGGA TAATCACCCC 120 

CTTCATAGTC TTCAGGCTGA GGGAGGTCTG GCAAGCCTTG CCCCTCATTT TGTTTGCGGT 180 

GTTGGGCCTG CTTGCCGCGG GAGTGACGCT ACTTCTTCCA GAGACCAAGG GGGTCGCTTT 240 

GCCAGAGACC ATGAAGGACG CCGAGAACCT TGGGAGAAAA GCAAAGCCCA AAGAAAACAC 300 

GATTTACCTT AAGGTCCAAA CCTCAGAACC CTCGGGCACC TGAGAGAGAT GTTTTGCGGC 360 

GATGTCGTGT TGGAGGGATG AAGATGGAGT TATCCTCTGC AGAAATTCCT AGACGCCTTC 420 

ACTTCTCTGT ATTCTTCCTC ATACTTGCCT ACCCCCAAAT TAATATCAG 469 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 550 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CCCCCCCCCC CCCCCACACT TTCAGGAGTC ACCCCCCAGC ATTTGGGGTT GGGTTGGCCC 60 

TACTCCAGCC TGGAGCTCCC TGAGGGAGCC TGCACTCCCT GCTCCCAATC CCCGCTACTG 120 

GTGCAGGGAT GCAGCCTGGA GCTGGCGTCC TTGTTCTGGG CCTCCTGCTG CCGCCACCCC 180 

AGAGCCCCAG CCTGTCCTGA ATTGACATCA GTGCTTCCCT GAACTGCCTC CCCCACCCCT 240 

GGGCATTATC CCAGGAAACT TTATGTTTTC TAGAAGCTAA GCAGCTGCTG GGACTCAGGG 300 

ACTGGTGCAG GTAGGCTGAG TGGCAGCTCA GTCCTAGAAG GTCTCTGAAG ATCTGGACTG 360 

AGGACCTTGC TACTCCCCAA GCCAGAGCCC ATCAGCCAGG CCTGCTGTGA GCCACCTGCC 420 

TGTGGAGTGC TGAGCTCAAC CAAAGGCTGG CAAGCTCTGG GCCTCATTTA AGGGATTCTG 480 

ATGAGCCGAT GGGCCCTGGA GGCAGCCCAT TAAAGCATCT GGCTCGTTTT TGGAAAAAAA 540 

AAAAAAAAAA 550 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 741 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TCTTGAAGAG TGTACAGTAC AGGATTATTA TAATGAAAGT TTATATCAAC AGGGTTTCGT 
TGGCTCTGCA TATATTATAA GCAAAAGAGA TTGGTAAAGT GCCACAGTAT TCCAGATAAC 
TTTTCAGTTG CGGCCTTTCT TCTCGTTCTT TAATTTGAAA CCTAGATACA TGCAGTAAAA 
ACTAGGAGAA TGACTTTTAC CCTTGGGGAC AGCCAAGTTT TGTTGATAAA CCTATTTCCT 
AGCATGCCTT CAGGAAGTTG TGCCAGACCC TAGATTGTGA AGGACCCACT GTTCTTCTGT 
TGTACGAGCT CCCTGAACCA TTGTTCAGAG GACCAATGTC ACATCGCTTC ATGGGCATGG 
NCCATGGGAG CATCTGGGTG ATAYCTGTCT ACAGTATTGG CTCTTCTGCG AGGCTGATAC 
ACAAGGOCTC TCTTCCACAT GATCATTTGC AAACCTCCCC CAGCCCCTAC CATCCAATGT 
GGAAGGAAAA CAAGAACTGC CTGAAGAAGA GTCCAAGCTA CAGATACACA GCGTGTGCAT 
TGCGGCTGTC ACCTTCCTCC TCCCACTTCT GTATCCTCAG AGATGCTGCG TGGATGTTTC 
CTTAACCTCA GCTGACTTCC CTGTGAATGT CTAATGCTAG TTCAGGGCCT CCAGGCATTC 
ATTTGTACAG TGGTAACTCC CAATGAGGCT TCTGTTATCA TTTGGTGTGC TTTYTCTGTC 
ATTAAAAGAA ATGATTTTCC C 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 991 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGCACGAGTC TCCCCTGGGG AAGTTTTTCT TTTTCAGGAG GGAGGAGGGC TTTCCCAGGT 
AATGTGTCTA GAGTGTTGGG CAGAAAATCT GGGACCACAC CACACCAGTT CTCTCCTTAA 
TCCACGTCAT TTGCCTTCTA TCCCAGCTAT GTTTCCAGTG TCCTCTGGGT GTTTCCAAGA 
GCAACAAGAA ATGAATAAAT CTCTGGTGAG TTGTTTATTT GTTCTTCACT TTGTTTTACA 
CTGTATTTTC TGAGTTTATG GGTGTCTGTG AATTAAAAAG GAAAAGTAGA AATAAGTAAA 
ACTCAGGTTG AAGGAAATAT ACATAAATAA GATAAAGCTG ACCTGTAGAT ATAGCAGGTT 
ATAAAGCTTA GAGTTGTCTA AGTTGAGTGC AAATTTTCCT CTGATCTTTC TGATGCCGAA 
CAAAAAAGCA GTCATGTTTG TTATGTGATT GGAATGGAAC CCGAGAAGAG AGCATGCTGT 
GTTCITGTGG GACAGGAAAG CTTGCGTGCA CCAAGTCTGA ACCACCACCT TCATGGTGAC 
ATAGATTATG TGCTGGAACA TATTTCACAC CGGCCTGGCA GTAAACACTT GTAGTGTTGT 
GCAGTGGAAA CGGTCATCTT CCGCTAAAGC ACGGCGTGTT GTGCAGCGGA AATGGTCATC 
TGCTGCTAAA ACACAGCTTC CATCGTAATG TATGCTCCTT ACTCAAAGAG TGTGGTCCCA 
AACAGCCTTT GGGAGGTCCT CCTTGATTCA TGGATGAAAC CTGGAACATC TTGAGGACTG 
AGTTAACCAT AGGTCCTTAA ATAACTCTCC ACACGTTTTT CTTAGTTTAT CTCTACATGC 
AGGGTGTGCA GCAGCCTGTT CAAAGTCATA TTTTCTGGGA AATATTTCCA GTGTTTATTT 
GCACTTTAGC CCACTCTGTG TAGCCTTATT TCTTCTAAAC TCACCATTAA TCTGAATAAT 
AGTCAAATTT AGGGGGACTG TATTTGCCTT A 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 653 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CCACGCGTCC GGAATTCCCC TGAGGATCTT GGGCTATCTT TGACAGGGGA TTCTTGCAAG 
TTGATGCTTT CTACAAGTGA ATATAGTCAG TCCCCAAAGA TGGAGAGCTT GAGTTCTCAC 
AGAATTGATG AAGATGGAGA AAACACACAG ATTGAGGATA CGGAACCCAT GTCTCCAGTT 
CTCAATTCTA AATTTGTTCC TGCTGAAAAT GATAGTATCC TGATGAATCC AGCACAGGAT 
GGTGAAGTAC AACTGAGTCA GAATGATGAC AAAACAAAGG GAGATGATAC AGACACCAGG 
GATGACATTA GTATTTTAGC CACTGGTTGC AAGGGCAGAG AAGAAACGGT AGCAGAAGAA 
GTTTGTATTG ATCTCACTTG TGATTCGGGG AGTCAGGCAG TTCCGTCACC AGCTACTCGA 
TCTGAGGCAC TTTCTAGTGT GTTAGATCAG GAGGAAGCTA TGGAAATTAA AGAACACCAT 
CCAGAGGAGG GGTCTTCAGG GTCTGAGGTG GAAGAAATCC CTGAGACACC TTGTGAAAGT 
CAAGGAGAGG AACTCAAAGA AGAAAATATG GAGAGTGTTC CGTTGCACCT TTCTCTGACT 
GAAACTCAGT CCCAAGGGTT GTGTCTTCGG AGGCATCCAA AAAAAAAAAA AAA 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1486 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GGCAGGCTGA CGACCTGCAA GCCACAGTGG CTGCCCTGTG CGTGCTGCGA GGTGGGGGAC 60 

CCTGGGCAGG AAGCTGGCTG AGCCCCAAGA CCCCGGGGGC CATGGGCGGG GATCTGGTGC 120 

TTGGCCTGGG GGCCTTGAGA CGCCGAAAGC GCTTGCTGGA GCAGGAGAAG TCTCTRGCCG 180 

GCTGGGCACT GGTGCTGGCA SGARCTGGCA TTGGACTCAT GGTGCTGCAT GCAGAGATGC 240 

TGTGGTTCGG GGGGTGCTCG GCTGTCAATG CCACTGGGCA CCTTTCAGAC ACACTTTGGC 300 

TGATCCCCAT CACATTCCTG ACCATCGGCT ATGGTGACGT GGTGCCGGGC ACCATGTGGG 360 

GCAAGATCGT YTGCCTGTGC ACTGGAGTCA TGGGTGTCTG CTGCACAGCC CTGCTGGTGG 420 

CCGTGGTGGC CCGGAAGCTG GAGTTTAACA AGGCAGAGAA GCACGTGCAC AACTTCATGA 480 

TGGATATCCA GTATACCAAA GAGATGAAGG AGTCCGCTGC CCGAGTGCTA CAAGAAGCCT 540 

GGATGTTCTA CAAACATACT CGCAGGAAGG AGTCTCATGC TGCCCGCANG CATCAGCGCA 600 

ANCTGCTGGC CGCCATCAAC GCGTTCCGCC AGGTGCGGCT GAAACACCGG AAGCTCCGGG 660 

AACAAGTGAA CTCCATGGTG GACATCTCCA AGATGCACAT GATCCTGTAT GACCTGCAGC 720 

AGAATCTGAG CAGCTCACAC CGGGCCCTGG AGAAACAGAT TGACACGCTG GCGGGGAAGC 780 

TGGATGCCCT GACTGAGCTG CTTAGCACTG CCCTGGGGCC GAGGCAGCTT CCAGAACCCA 840 

GCCAGCAGTC CAAGTAGCTG GACCCACGAG GAGGAACCAG GCTACTTTCC CCAGTACTGA 900 

GGTGGTGGAC ATCGTCTCTG CCACTCCTGA CCCAGCCCTG AACAAAGCAC CTCAAGTGCA 960 

AGGACCAAAG GGGGCCCTGG CTTGGAGTGG GTTGGCTTGC TGATGGCTGC TGGAGGGGAC 1020 

GCTGGCTAAA GTGGGKAGGC CTTGGCCCAC CTGAGGCCCC AGGTGGGAAC ATGGTCACCC 1080 

CCACTCTGCA TACCCTCATC AAAAACACTC TCACTATGCT GCTATGGACG ACCTCCAGCT 1140 

CTCAGTTACA AGTGCAGGCG ACTGGAGGCA GGACTCCTGG GTCCCTGGGA AAGAGGGTAC 1200 

TAGGGGCCCG GATCCAGGAT TCTGGGAGGC TTCAGTTACC GCTGGCCGAG CTGAAGAACT 1260 

GGGTATGAGG CTGGGGCGGG GCTGGAGGTG GCGCCCCCTG GTGGGACAAC AAAGAGGACA 1320 

CCATTTTTCC AGAGCTGCAG AGAGCACCTG GTGGGGAGGA AGAAGTGTAA CTCACCAGCC 1380 

TCTGCTCTTA TCTTTGTAAT AAATGTTAAA GCCAGAAAAA AATAAAAAAA AAAAAAAAAA 1440 

AACTCGAGGG GGGCCCRKAC CCAATCWCCC TATAGTAKAC GTANNN 1486 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

CTTCGCCGTT TCTCCTGCCA GGGGAGGTCC CGGCTTCCCG TGGAGGCTCC GGACCAAGCC 60 

CCTTCAGCTT CTCCCTCCGG ATCGATGTGC TGCCGCCGCC GCCGCCGCCG TCCCGCGTCC 120 

TTCGGTCTCT GCTCCCGGGA CCCGGCTCCG CGCAGCCAGC CAGCATGTCG GGGATCAAGA 180 

AGCAAAAGAC GGAGAACCAG CAGAAATCCA CCAATGTAGT CTATCAGGCC CACCATGTGA 240 

GCAGGAATAA GAGAGGGCAA GTGGTTGGAA CAAGGGGTGG GTTCCGAGGA TGTACCGTGT 300 

GGCTAACAGG TCTCTCTGGT GCTGGGAAAA ACAACGATAA GTTTTGCCCT GGAGGAGTAC 360 

TTGTCTCCCA TGCCATCCCT GTTAATTCCT GGATGGGGAC AATGTCCGTC ATGGCCTTAA 420 

CAGAATCCCC CAGATGGCTT CATGGCCCCC AAAGCATGGA AGGTCCTGAC AGATTATTAC 480 

AGGTCCCTGC AGAAGAACTA AGCCTTTGGT CCAGAGTTTC TTTCTGAAGT GCTCTTTGAT 540 

TACCTTTTCT ATTTTTATGA TTAGATGCTT TGTATTAAAT TGCTTCTCAA TGATGCATTT 600 

TAATCTTTTA TAATGAAGTA AAAGTTGTGT CTATAATTAA AAAAATATAT ATATATATAC 660 

ACACACACAT ATACATACAA AGTCAAACTG AAGACCAAAT CTTAGCAGGT AAAAGCAATA 720 

TTCTTATACA TTTCATAATA AAATTAGCTC TATGTATTTT CTACTGCACC TGAGCAGGCA 780 

GGTCCCAGAT TTCTTAAGGC TTTGTTTGAC CATGTGTCTA GTTACTTGCT GAAAAGTGAA 840 

TATATTTTCC AGCATGTCTT GACAACCTGT ACTCTTCCAA TGTCATTTAT CAGTTGTAAA 900 

ATATATCAGA TGTGTCCTCT TCTGTACAAT TGACAAAAAA AAAAATTTTT TTTTCTCACT 960 

CTAAAAGAGG TGTGGCTCAC ATCAAGATTC TTCCTGATAT TTTACCTCAT GCTGTACAAA 1020 

GCCTTAATGT TGTAATCATA TCTTACGTGT TGAAGACCTG ACTGGAGAAA CAAAATGTGC 1080 

AATAACGTGA ATTTTATCTT AGAGATCTGT GCAGCCTATT TCTGTCACAA AAGTTATATT 1140 

GTCTAATAAG AGAAGTCTTA ATGGCCTCTG TGAATAATGT AACTCCAGTT ACACGGTGAC 1200 

TTTTAATAGC ATACAGTGAT TTGATGAAAG GACGTCAAAC AATGTGGCGA TGTCGTGGAA 1260 

AGTTATCTTT CCCGCTCTTT GCTGTGGTCA TTGTGTCTTG CAGAAAGGAT GGCCCTGATG 1320 

CAGCAGCAGC GCCAGCTGTA ATAAAAAATA ATTCACACTA TCAGACTAGC AAGGCACTAG 1380 

AACTGGAAAA GACCACAGAA AACAAAGAAT CCAACCCTTT CATCTTACAG GTGAACAAAC 1440 

TGTGATGATG CACATGTATG TGTTTTGTAA GCTGTGAGCA CCGTAACAAA ATGTAAATTT 1500 

GCCATTATTA GGAAGTGCTG GTGGCAGTGA AGAAGCACCC AGGCCACTTG ACTCCCAGTC 1560 

TGGTGCCCTG TCTACACCAG ACAACACAGG AGCTGGGTCA GATTCCCCTC AGCTGCTTAA 1620 

CAAAGTTCCT CGAACAGAAA GTGCTTACAA AGCTGCCTTC TCGGATACTG AAAQGTCGAG 1680 

TTTTCTGAAC TGCACTGATT TTATTGCAGT TGAAAAAAAA AAAAAGCTAT TCCAAAGATT 1740 

TCAA3CTGTT CTGAGACATC TTCTGATGGC TTTACTTCCT GAGAGGCAAT GTTTTTACTT 1800 

TATGCATAAT TCATTGTTGC CAAGGAATAA AGTGAAGAAA CAGCACCTTT TAATATATAG 1860 

GTCTCTCTGG AAGAGACCTA AATTAGAAAG AGAAAACTGT GACAATTTTC ATATTCTCAT 1920 

TCTTAAAAAA CACTAATCTT AACTAACAAA AGTTCTTTTG AGAATAAGTT ACACACAATG 1980 

GCCACAGCAG TTTGTCTTTA ATAGTATAGT GCCTATACTC ATGTAATCGG TTACTCACTA 2040 

CTGCCTTTAA AAAAAAAAAC CAGCATATTT ATTGAAAACA TGAGACAGGA TTATAGTGCC 2100 

TTAACCGATA TATTTTGTGA CTTAAAAAAT ACATTTAAAA CTGCTCTTCT GCTCTAGTAC 2160 

CATGCTTAGT GCAAATGATT ATTTCTATGT ACAACTGATG CTTGTTCTTA TTTTAATAAA 2220 

TTTATCAGAG TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2280 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 2323 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 683 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

GGCACGAGCC TGTGTGGTCA TGTTCCTCGT GGTGCAGTAC CTGACATGAG CCAGCCACGC 60 

TCAGTGGCTG AACAGCATTC CCACAGCCTG CAAGTGTGTC TGTGTGTGAA AGAGAGAGGG 120 

GGGCCCAGAG CCGCCTTTTG AAATGTTTGC CTGTCTGAAC TGTGAAGACA CTTGGGAGTG 180 

ATTGTGGTCT AATTTCCAAC CTGCTCTGTT TTCTGTGACA TCTTGGAGGG GAGCTAGTGC 240 

CACACCATGC GCGGTGCTTA GAAATGAAAA AGTCCCGGGT CTGTCTCTCT CACTCTCGCT 300 

CTCATGGGGG AGGGAAAGAA TGGCTTTGGT GGCTTTGTTC ACACAGCTGA TGCGTGCTGG 360 

GAAGGTGTCC ACAGTGAGCC TGTGTGCAGG ACTGTCCACA CGGTTCACAC TTGTCACCAT 420 

CAGGCCTTTC TGGTCCTGAT AGGGTGGAGC AAAAGTGGAA AGGAAAGGAA AGAGGCTTTT 480 

CTCACAGCCA TTATATTAAA TAGTAGGTCG ATTCACATCT CGTGCTCCTG GCCACCTTCC 540 

CCTGTGCCTC AGTGACATGT AGATGACTGA CTGCCAATAC TTGTCACCAT TCCCTGGAAG 600 

CAGCTACCTA GGGGAAACAA GATGTAGTGC TATTGCCGAT AACAAGTAAG ATTTTCCACA 660 

CTAAAAAAAA AAAAAAAAAA AAA 683 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2036 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CTGAGAAAGG AAAGCATTCG GATCTGCTGC AAAAACACAT ATATCCATAA AGACTCATGT 60 

TATTCAGAAA ACAGATTGTG AACACAATCA CATTCGCATG AATCCTTTAA AAGGAAGAAG 120 

ACCTTAAAGT ATCTGCAAAT CTGAATTTCT ATTTATTCCT TCACTGAATA TAGAAACAAT 180 

GGTTATCTGA TTATTAGAGA TATTATTTTG GATATGTTAC TTATTAACTT GCTATGGCTG 240 

GTAACCATGA TAAAGTCTGT TATTAATAAC AACATAATTC TTTTTTTAAA GAAGAAAAGC 300 

TTATTTTTCA TTGACAGTGT ATAGATTTAT CTACTTAGTT GTGTTTTGCT ATTAGTGTTT 360 

TAATTTTTTT TTTAAGTTGA GTGTTTGATA AATTTTAAGA CCCTGTCCCC ACCTTGTTTT 420 

GAGTCCTGTG TTGACTACAG GTATATAGCY CAWTTTAAAA ATCCTAAAGC AAAAGAATTT 480 

TATTTATAAA AGAATCMAMC MGTTGCATGC ATGAGGCTGT GAAGTCAGAT ATTTAGTAAT 540 

AAAAGCAGCA GTGCCTTTTT TTGTATTTAC CCATTGACCC CCACCAAATG CAACTGTTTT 600 

ATATTAAGAA AATAGTAACA ATTTTAAAAT CTCAGAGTAA AATCTATTTC ACTACATGCT 660 

TTTCCCCCCT TGTTCTGATT TAAGCAGTGT GTACTTGGCA TCTCTACATT GTCCTAGGGA 720 

CAGTGGTGTT CTACAATATT ATCATGTATG ATGTTTTATT GGTGCTTTTT ATTCATAGTG 780 

GCTTCTTACC AGAAACAGTA GGAAGAAACA CATGAACTGT GTACAAGACA TGAAACATTG 840 

CTGCTGATAT GTTGTTTTTT CACATGCTTT TGAGTTTTCA CTTTTTAAAC GAGAGCCAGC 900 

AAGCAAAATA GATGTGGCTG GGTCTGCCTG TCCGGGCGGC TYTTTGCACC GAGCTCTCAA 960 

ATCCTGTGTA TTGAGGGTTC CTTTTTGGTA CTCAGGATTG GAGCTACAGC TGGGCCCCCC 1020 

TCTCTCCCAT TCGTTTGAAG AGACACTGAG GGAAACAAGG GTTTCTTTTG AGGTGTCCTT 1080 

GGCTGCCTTT TACGOGATGG GAGCCTTCTC CGGATCTTTT GTTCTTCTGC ACCTCTTGTA 1140 

GCTACTCCCG GTGCAAGGTT GTAGATGTTA TTCCCCAGGA GCCTGGGCTK GGGGGCTGAG 1200 

CTGGGCTGAA TGCAAAAGCA TGCAACCAGA AGGCGGGCAA GGGGAGGAAA AGCAGGCCTG 1260 

GCCTCATTGG TCCCCTGGAG ATGTCTGTAG CAGTCAGCTC CAGCTTGGGC CTGGGGAAGC 1320 

AGCCTGACCA AGGCGCTCAG GTGTGCCTGT TACAAGAAGA ACCTGCAGAA GGATAATTTG 1380 

CACATGGAGC TGTGATAACA CTAATGTTGA 'ITiT m T IT TTTTACAAGT CATCAGRGAT 1440 

GTTTGCAAAG TGAGTTTTAT TTTTTTGTAA TTCCTTTATC TTTACTTAAA GGTGAATGTG 1500 

TATTCCTCTG GGAGGAATAG GAAGAAAACA GGAATGTTAA TAATGTCGAA CAGAAAACTT 1560 

CCTCCCTTAT TAATATATAA TCYTCATGTA TTTATGCCNT AATCTAAGCT GACTTTTAAA 1620 

AAGCTTTCTT TTGTTGCATG CCCTGTGCAG GCATCTGTAT TGTACATGCA TGCCTTTCGT 1680 

CCTGTTTTCC TGTATAAAGT TAGTGAACAA AGAAATATTT TTGCCCTAGT TCATGTTGCC 1740 

AAGCAATGCA TATTTTTTAA ATTTGTCATA TATGGAAAGA GCATGTTTGT TACATGTAAA 1800 

AGCTTTACTG ATATACAGAT ATACTAATGT TTGAAGATGC TGTTCTTTGC AAGTGTACAG 1860 

TTTTCAAATG TTGTTACCAG TGAAACACCC TTGTGGTTTA AACTTGCTAC AATGTATTTA 1920 

TTATTCATTT CCTCCCATGT AACTAAGAAT CATGGCTATA TTTCATATCA ACGTTATATT 1980 

GAAAGTGAAG GGAAATGATT AATACAAGGT TTTGTAACAA AAAAAAANAA ANNAAA 2036 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GGCACGAGAT AACATAGGCA CAATAATACT GTATGTCTAC TTCTAGGATT ATAAGGAATT 60 

AACATTGAGA TGACATTTCC ATTTGAGAAG AAAATAGTTG CTTTCAGTGC CTTTTATTTG 120 

ATTCCTGGAG AGAGCAGACT CGCACCAACA TTCAACCCCA GCGCTGATAT GACAGTAATC 180 

CTCAGAGGCA GAGCCCAGCA CAAAACAGCA ATGCTAGAAA GTTACAATTG GAAAGTTTCC 240 

TGCCAGCTTC GGGAATGACA CTGCAAAGCT GATGCCAGAA ACTGCCAGAG TAATTCTCCT 300 

CATTACTGCT CTACCCACCC ACTTTCAGCT CCCCAAATTA ACTAGTGCAG TTGACTAATC 360 

CTCTTTACCT TTATCATTTA GGTGAGGCAT TGCACAAAAA CTCTCGACTT TGCCATATAA 420 

GGGCTGTGGT TCTCTGTGGT CCTGGATAAG AGGCATCACC ATTATCTGGA AACATGCAGT 480 

AAATGCAGAT TCTTCATCTT CTCCCCAGAC CTCCTGAGTT AGAAATTCAC AAGTTCTCCA 540 

GGTGATCTCA TACATGCTAA AGTTTGAGAA CCATTGAGTA AAGTTAATGC ATTAAGAAGA 600 

GATTAGATAG GGATGGTGGC GTATCTTCCT ACAGTTTCCC TGTTAACAAG AAAGTCAGAG 660 

GTCAGTTGAT CAGACATTAG ATTATTTATT GCTAAAACTA AAAAAAATTA AAAAAAA 717 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 495 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GAATTCGGCA CGAGCAGCAT CCTAATTTTA GTTTGGAGAT GCATTCTAAA GGATCTTCTC 60 
TATTGCTTTT TCTCCCACAA TTAATCTTGA TTCTGCCTGT CTGTGCACAT TTGCATGAGG 120 
AACTGAACTG TTGTTTTCAT AGGTAAATGA GAGACTGAGT TTTTTCATTT CTGAAGAGAA 180 
AGGGCATTTG CTCCTACAAG CTGAAAGGCA CCCCTGGGTG GCTQGGGCCC TCGTGGGAGT 240 

30 

TTCTGGGGGA TTGACCCTTA CAACATGCAG TGGCCCTACA GAAAAACCTG CAACTAAAAA 300 
TTATTTTTTA AAAAGGCTCC TCCAGGAAAT GCATATAAGG GCTAATCACC CAGTATTTTG 360 
35 ARGCTTCGAA GARGTAATAR AMCCCTGGAG AGAGAAACTG AGACATGTAA GAGGGTGGGA 420 
ATGACTCAGT GGTGGCACAC TATGGAGTCC TGCCCACAAG TAGCACACAT CAACCCACTA 480 
CACAGAAATC CTAGG 495 

40 



45 



(2) INFORMATION FOR SEQ ID NO: 29: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 556 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

AGCTTAACGT CATGATTCAT TAGGGGAATG CAAGGCAAAA CCATGATGAG AATGCCCCTA 60 

GACACCTCTT AGAAGAGCTG CTAGAAAGGC AGACAGCACC AAGCGCTTAA ATGAGATGGG 120 

GGCACTOGTG CTTCTTCTGT GCCTACTGGT AGGGGTGCAG CAGAGTGGTT CAGTCTGGGA 180 

CAGTTAGCTG GACATCACGT GGACCCAACA CACGCATTTC CTGGGTTACT TACCAAGGAG 240

AATAGAAAGC AGGCAGATCT TTACAGCAGC TCTTACCTGW TTGCAAAACA ATGGAAATGC 300

CCACATGTCC ACAAACAAGT KTGTGGTCTG CCTGTGCCAT GAAGCACAGT GTGGCTGAGC 360

GTCAAGAGTC CCCACACTCA AAGGAGGCAG CAGATACAGG GCTGCACACT GTGTGATTCC 420

ACACATGTGA CATTCTGGAC ACGGACATGC TGGATGGCAA AACGAGCATC GGGCTGAGAG 480

GACTGCTGAG AAGGGGAACG GGGCTGCTGG GATGTGGGTT GATTGTAGCA GTAGCTCATG 540

GAGATGTGAC CTCAAA 556

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 434 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

CTAAATGGTG ACTGTGGCTT TGTCGAGACA GGCCCCAAAT GGTAGGTGTG AACACAACAT 60

GCACAGAATG AGGAGACATG CAGAGTGCTG AAATACTGTC CTGGACAGAT GTGTTACATG 120

ACTTTCTTTT CAGCTTATTT CTGTGGCCTG CCTTTGAAGA TAGAGCTTTG TTGATATTTA 180

CATTAAACCA AATTGTATAA YTATGTTCCA TTCTGACATG TTATTTAGCA AARGAAAAAR 240

GAGTAATTCT ACATCAGCAT CTTTAGTGCA TGCTAAAAGA TTAAAAATGT CTTTTGGGGA 300

ACATGTTTTG TATACATAAA TGTTTAGATA GAAATATTTA TAGAATNCTC TATGTGAGTA 360

TTNATCTCCC TATGTATATT TATATCTAGA TGTGTCAATC TTTGTATTGA TATGAAATGC 420

TATGAATAGT GAGA 434

(2) INFORMATION FOR SEQ ID NO : 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 715 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CCACGCGTCC GATCTCACAG CTCCGACACT ATTGCGAGCC ATACACAACC TGGTGTCAGG 60

AAACGTACTC CCAAACTAAG CCCAAGATGC AAAGTTTGGT TCAATGGGGG TTAGACAGCT 120

ATGACTATCT CCAAAATGCA CCTCCTGGAT TTTTTCCGAG ACTTGGTGTT ATTGGTTTTG 180

CTGGCCTTAT TGGACTCCTT TTGGCTAGAG GTTCAAAAAT AAAGAAGCTA GTGTATCCGC 240

CTGGTTTCAT GGGATTAGCT GCCTCCCTCT ATTATCCACA ACAAGCCATC GTGTTTGCCC 300

AGGTCAGTGG GGAGAGATTA TATGACTGGG GTTTACGAGG ATATATAGTC ATAGAAGATT 360 

TGTGGAAGGA GAACTTTCAA AAGCCAGGAA ATGTGAAGAA TTCACCTGGA ACTAAGTAGA 420 

AAACTCCATG CTCTGCCATC TTAATCAGTT ATAGGTAAAC ATTGGAACTC CATAGAATAA 480

ATCAGTATTT CTACAGAAAA ATGGCATAGA AGTCAGTATT GAATGTATTA AATTGGCTTT 540 

CTTCTTCAGG AAAAACTAGA CCAGACCTCT GTTATCTTCT GTGAAATCAT CCTACAAGCA 600

AACTAACCTG GAATCCCTTC ACCTAGAGAT AATGTACAAG CCTTAGAACT CCTCATTCTC 660 

ATGTTGCTAT TTATGTACCT AATTAAAACC CAAGTTAAAA AAAAAAAAAA AAAAA 715 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 486 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GAGCCAGTGC CGGCGAAAGG GGACCTTCCT CTACTTCCTG CCACAGACCC TGTCCCCACA 60 
CACTTCCTGC CCCTGCTCTG CTGGGAGGCC ACTTCCTCCC CCAGTGCTGG ATTCCACCCC 120
CAGCTCACCC TCAAACATGG CCCCCTCTCT CCTCCTGCTT GCCCCTCTCT GCTCCCTGGA 180
GGCTGTTCTG TCCTCCCCTC TTGAAAAGCA ATGCCAGCTT CCTGGGATCT TCTGCCAACT 240
CCAGCTACCA TGCCCTTTGC TCCTGTCAGC TCAGCTCCTC AAGGGAATTG TCTAMCCTCG 300
GTGTCCTGCT TCCCTCCCTC AACCTCCTCA CCCTGCTCCA AGCTGGCATC TGCCCCTCCA 360
CTGCACAGAA CGGNTCCCCC ACCACCTGCC TTTACAGGGA GGAAGCAGCA ACATGGAAGA 420
ANCGAACTAT AGGGGCTACA ANGATGCTCA GCTCTGATCC CGAAGGCAAA AAGNATCTTT 480
GGGCAC 486



(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 725 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

GTTCCTCTGG TAATAATTAG GTTATTCCCA GAAGCACAGT GTCATTCTTT AAATAAAAGC 60

TTTCCTGTTT AAAGCTTTTC AAAGGAGCAG ACCACCTTGA AGAITCCCCC TAGGGTTGAT 120 

ATGTGTCTAA TTCATTTTAT AAAAATTATT CTTGTCTTCA TITTAAAGCT TTGGCTATAT 180 

AGTCAGAAAT GTCCTAAATA ACAAACTATT TTGTATTTAA TTTAGGGAAG ACTAAAGGGA 240

AGAAAAATGA AAACTCAGTC TTTATGTAAG CTCCAAGGAT ATTAGGGCTT AAAGGGCTTT 300 

TCTAGTTTTA TGAGAATTTG TACTACTGAT TTTTATATAT TCCT UTmT GATGAACAGA 360

TCTCTGGGGA AATTGTTGAG TTACAATGGC ATTTCACTGT GATCCCTCTC AAGCTCAGAT 420 

CAGTTCTATA ACCCAATGAC AACCTGTCTC TTTGGTTTAC TGTCCTGTGA AATGTCAGCT 480 

CAAGTTTCCC AGAAGTCGTG TGTTTATGAT GAGTCAGAGT GCTTTTCCTC QGTGGGACAG 540

TTGCTGGCCC TCTTAATTTT GGTGTATGTG CTTCCAAGTA TCTAAACCTC CAGTCTGATC 600 

TGTATATGCT ATCCTAACTG TTAATTGTAT TATTGA7TAT GTTGATTATC TTGCTTGAAG 660 

GTTCATACTT TTCAATTTGA TAGAAATAAA GTTTTTTTCT GCTTATAAAA AAAAAAAAAA 720 
AAAAA 725

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 437 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

CACACAGCAT GCTGCCCTCA GACGTGTCCA TCCTGTACCA CATGAAAACG CTGCTGCTCC 60 
TGCAAGATAC TGAGAGATTG AAGCATGCTC TGGAAATGTT CCCAGAACAT TGCACGATGC 120

CTCCTGCTTT TATTGGCTCT TGTCGAAATC AAATTGGAAG ATCTTCAGTC CCAGCTGCAC 180 
CC ^ CGTGGA AAAGTATTCC AGGTCCATCC CCAAGGAACC AACACCGATG ACATGGACTC 240

AGGAATCTTA TAACCTACGT GGACTCTTTC CATCCGTACA TTGTCGTGCA CATGCCACTC 300 

ATCACCTGGC GTGCCCAGAT CCTCGCARGG CAACACCCTG TGATAATTCC AGGTGATTCT 360 
CTACATCTGC AGCTTGAGGT TAGCCTCATA TCACATTACA TTCTCACTAN AAACNAAAAA 420

AAAAAAAAAA AACTCNA 437 

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 943 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GGCACGAGCT GGAACAGAGA CTAAATCCCA CGAAACTGAC ATTGTTAAAC ACACTAAAAC 60 

AGAAGTACTT ACCTCTTGAA GATTTAATAT ATAATGGTTG ACATGATACA TGTACATGAT 120 

GAATGACCAG ATGCTTATGG TCTACATTTT CCTTTATCCT GTTAGTATTA CCTTCCTTAA 180 

TCTTTGTTCA TTAACATGCT AATTCCTCTT CAGTGTTTAT TTTCTAGTGA CAGAATGCTA 240 

ACATTTCTTA CACCCTGGCA GAAGGGAGAG AAATGTGTTT TGGGGTGGGT AACTAAATTT 300

TTGAGTGAAA TATCATAAGA TGANAATGGA AANAAGGAGA CACAAANAGT TATNACAAAA 360 

AAACAATGGT TTTTTTAGCC ATTTGACTGG CTCTTTAAAT AGTCTACAAG ACATTCACGT 420 

TTAACATCAC TTTTAGTGAA ATAAAATGTG CCATACTAGT ATGTGCTTCA AAAGGGCAAA 480 

TGTGCTTTAG TGCCCTAAGG CTAAATTTTG GTCATTTGAC ATCAGAGATG TTGTAAGTAT 540 

TGCACTTAAT ACGCACCTAT TTNTCAATAG TGTTATTTTT TGGNTAGCAT TTTTTTTACC 600

ACTATNTPGT TGATAGCTTT TTGTTCTOTN AGGTTGNAAN ATGACAGTGC TNATNTCAAA 660 

CAGATTACCC ATNTGCAGAA CTAAGGGAAG CNATTTATGT ATGAAAGNAA TTNTTGAATT 720 

NGTCATTOTC AACCNTTGNA TTAAAGCTTA GACTAAATAG TAATATATOG TGGGNAGGAT 780 

TTTGGTTITG TGATATTTNT GTGNATTAAG GNATAGATGT TAACCNTTAT TTTGTAGNAA 840 

AGTGANTTGT ATGTGGTTAA TTATAAATAA AACTGGTACC AGGNAAAAAA AAAAAAAAAN 900

NAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 943 



(2) INFORMATION FOR SEQ ID NO : 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 604 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGCACGAGAA ATCTTCATGC TGTAGTCACT CCAGACCATG GAGTGGCTTT CCAGCTGAAT 60 
GAATCCTATG TCTCGCGTGC AGGTGGTTGG TTTTCAATGT TCTTGCTAAT TTTTTTTCTA 120 

TTGGATCTTG GGAGTTTTCT TTCTTTGCTC CTGTGTTTGC CCAGCTTTAA TAAAACCAGG 180

CGCAAACAAA AACCATAGCA TTCTGAACAA TAGGGGGCCC ACATTGGACC CAGTATGTCA 240 

CTTTAATGGA CTTCAAGAAA AAATCTGAAT GGGAAAAATG ACACTAGGAA TGTATACTCC 300 

ACACATTTTA TGCCATATAA TGGTGTGTTT TCTTAATTTT GTTTCTTGTG GCGAAATGTG 360 

GCTTTCAAAT TAAAATGACC TTTTCTTCTT TGAAACTTTT TGTTTTGACT TGTATAATTA 420 

AGGGTTTGGA AAGATTCATA ATTCTGAGAG AGGTTTGCAA CCAGGAGATA CAAAGAAGTC 480 

TCAGTAGTAA TCTTGTTCAT GTGCTTTTAC AGGCAGCTAC ATTTAAGGAT GTATTAGTTA 540 
CAGAAATTAT ATGTCTGTGT ATGTGTCTCT ACTCAATAAA GTACATGCCT CCACAAAAAA 600
AAAA 604

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 349 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GTGAGTGCCC GGGAGCCCCG AGGCCCTGCC CCTAAGAAGG ATATCTYTRA CCGCTCCCTT 60

GTCCACACCC TAACCCCCCA GCTGCTCAGG CAGTGGGCAC ATGGCAGGGG CCTCACTGGG 120

GGCACATAGA GCATTTGGGG GACTGCGAGT GCTCACCTTT GACTTCCTGC AGGTCGGGGG 180

AAAACCAGAT CATGATGACC AAAGTYTACA TATTCTTGAT CTTCATGGTG CTGATCCTGC 240

CCTCCCTGGG TCTCACCAGG TATATGCCAC CACYTTCTGY TCTAAATTCA GAATAAGAGT 300

CACATCAGGA GAGCACTGTC CCCAGGANAA TGCAAACGGG TTGGCAGCA 349

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 672 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GTAGTCGTTG CGGTTGCCGG GATGGCGAAG ATCTCGCCGT TTGAAGTCGT AAAACGCACC 60 
TCGGTACCGG TGCTTGTTGG TTTGGTGATT GTWATCGTTG CTACAGAGCT GATGGTGCCA 120 

GGAACGGCAG CAGCGGTCAC AGGCAAGTAA ATAGTAATGC CGGAGCAAGT TTCCTCCGGC 180

TTTATCATGT CACCCACTGT GGTATATGCG TTGTGGTCTG CCAACTTTGC CGTGAACAAT 240 

TTCAGCAATA ATCAGATGGC GGCTGGCGCA ATATTCAAGA TAACGCCTGG CAGTGGTGCG 300 

GCTGATGGTT CAGTGCCTGC GSCACCGTTT YTGCCGTATG TTGCACACCA GGNTCTTTAA 360 

ACAGTTTTCG SACCGCGTTT AGCGTCAAGG GTTCAATGCC GGTCGGTAGC TCGTCCTTAG 420 

GTTCACCGCG AGCATAAGCA TTAAACATCT CATCAATTTG CTTCTGGCTC GCGCTATCAA 480 

TACTTTCCAG CATATGTTTA CGCTGGCGGA AACGGGTTAG CGTTTGCCCC ARCMGWTCAT 540 

AGGCAATGGG CTTAATGAGA TAATCAAATA CACCACAACG TACGGCITCA GACACCGTTT 600

CCATATCGCT GGCTGCAGTG GTAAACACCA CGTCGCCGGG ATAATGCGCC TGCACCAGTT 660 

CATGCAGTAA AT 672 

(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1908 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

AGAGTTGATA TTTTTAGAAA CAGTAATTTT ACTTTTAAGG AAATTGGCTA GCTCTTTGAC 60 

TNNAGAGCTG TAGGAAGCTC AACATTTCTT TGTAGAGAAC GTTGCTTTTT TTGGATTGTA 120 

CAGGTATAAA AACATTGCTT TTGTTGAATT GTATAGGTGT AAAAAGGGAA TAACTGTATG 180 

CAGGTTTGAA AAGGAAATGT GCTTTAGGCA TGAGTCATAA GATGCCATTG TACTTGTAGG 240

CATTTTATTT TCCTTTAGAA ATGGACATCA GCTCTTCTCT TCTGACTGGT AACACATAGC 300 

CCCAAAGCAT GAGATTATTT TTCATTGGGT TTTTATTGTT GTTTAGTTTT GGTTTGTTAC 360 

GCCAGCCCAG TCTGTCTGCG GAACACTGAC TCTGCTCTCT AATGAGAACA AAGTTAGAAA 420 

TCTGCCGATA ACCTAAAATA ATTTAGAAAT GAATTAAAAA TGTGAAATCG GGTTAAAGTG 480 

ATGATGATAA AATAGCATGC AAGAAACAAG CTCCTTCCAT CAGACTTGGC TACTGTTTTC 540

TTCTGGTACG ATTTGGTTTG GAAGAGCCTC TTGTTTCCTT CTCTTTGGGG TATGTCTTCG 600 

TTTCTTAATA TGTTTGTAAC ATTATTGAGA TATAATTCAC ATACCTTACA ATTCACTTAT 660 

TTTAAGGGTA CAATTTAGTG GTTTTTAGTG TATTCACAAA GTTGTGTAAC CGTGACCACA 720 

GTCAATTTTA GAACATTTCG TTACCCCAAA AAGAAACCCT GTACCCTTGA GCAGTCACCT 780 

CTCATTTTCT CCCAGTGCCC ACCCCATCCC CGAGCCCCKG GAACCACTAA TCTATTTCTC 840

TCTCTGTAGA TTTGCTTATT CTGGTCATTT CATATAAATG GAATTCTACA ATATTCGGTC 900

TTTTGGGACT GGCTTCCCAA ATATGATTTT CTATATGGAG TGAGAAAATT CTTCTCATCT 960 

TGAGAACTCT TATTGCTGTG AAAGGGAGTG GTTGGTAAAA TCAATAGATT TCAGGCAAGA 1020 

GGGCCAGATA CCTAACAGGT TTTTCTCCGT GAATCTTATG CTGAGTAGTT TTTCCTCATA 1080 

ACCAAGCATT TATGATATAT TACTACTTAT AATACTGTGG CTAGTCTCTA GAATGGATGT 1140 

TGAAATCTTT GCCTCCTCAG TCGGGAAGAG TCCTGCTAAA AATCAGGCTA AAAATCAGGC 1200

CAAAAATCAG GCCAAATGAC TTGGCAAATA ATTGACAAAG TGGTTTTCAC GTGTCTCTAT 1260 

CTTTGCTAGC AGCTTGTATA CCTCAGGCCA GGTGAGCTCC CCAAATTTCT TTTTTCATTT 1320 

ACTCCAGTGA GTTTCTGCTG TCTTTTTCAA GTATGTACCA TAGGACTTAA AGGTGATTTG 1380 

GATGCGTTGT AACACTGCTA AATATGCTAA GTACAGAATT TTATCTACAG TACTGTGAGA 1440 

CAGTCAATTA TTGCCTAGGG TAGTTCAAAA ATATGATGTG AGCTAGTTAA GCCTTTGCTT 1500 

GACTGATTTC AGTGATATTC AGAAGTGTGT ACCAATCAAG GCTCTTTAAA ATACGGAACG 1560 

ACTCACTTAA TAACCAGGGA ACCAGCCAAA TACTGTGCAG CCGCAGAATA TGCATATCAA 1620 

TGAGTTGGAG GTGATTATTC TCTGTAACTC CCTAATGATT GTTTTCTAAG CATTGTQGCT 1680 

TCTCAGTGGC TTGACAGCAT CTTCCTGGTT GTATGTGGCC TGTTTACATG ATGTATTGAA 1740 

TAATGTTGTT TGTTGTGAGC ATCAATGCCT GTAACACCAA ACTAAACACG TGTTTTTGGG 1800 

ATATGTTTCC AATCTTTAAA TGACCTTGCC CTGTCCAATA AATAAATGAT TGTCTCACCC 1860 

TGTTAAAAAA AAAAAAAATT AAAAAAACTG GGNGGGGGGC CCGGTACN 1908 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CCTCAAAAAA AAAAANGAAA GGAAAGAGGT CTCTACACAA GCCCGTGATT CTTCATGGCA 60
AGGGATAACA TCAGAAATGT TTCATTTYCK GCTATTAGTT TCCATTCCTT TCCCCATCCA 120
GGCATAAAGA GAAACAAAAG ACAATGATGG TATTCTCTGT GTCCTCAGCT TTGGCACTTT 180
TGTTGATGTT GCTAAGGAGC AGTGACCTTG CTAAAAAGAC TGAATAATCC ACCCACTGAA 240
TAGCTAACCT GGGGAGGAAA TGAAAATTTC CTTTGTGGAT CTCCCCAAAT CCATTGTTGT 300
CACCAGGCCC TCCCAGAACC TCCTCAGTTC CTTCACAGTG CAACCCTGTC TACTTGGCCC 360
GCAACCCAAT AGTATTGTGC CTCACTTCAC CTTCCATGGG CAACTGCCCT CCCTTCTGGA 420
CATAAAACCT CATATTTTAA ATNAAGTTGA AATTTGAA 458



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1153 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GGCACAGAGC CTCCGACCCA GGTGGTCTGG AGCCTGCCGG GAGAGTGGTC GCATCTGAGA 60

GGCTGGTCGT GGACTGTGGT TGGGGGAGGT GGGAGCTGTT TTAACCGTGT GCCCCCTCTC 120 

CTGTGCCGGC GTGGGCATCC CCCGGGGCAG TGGAACCCGG GCGCTCCTCC AGCTTCCGAG 180

TCCAGCCAGC CTGGGCGCGG GGCGCGCCCC GAGACACCCG AGGAGTCCGT TCCTCCCTGG 240 

TTACGTGGAC TGTGGAGCTG GTCTCTTGTG GCTCAGCGCC GTGCGGAGGT TGAAGCGTAC 300 

CTGCGGAGGT CGCACCAGGG CGTGAGGAGG AGGAGGAAGG GCATGAGCCG AGCTTGAGGA 360

ATCCGTGCTC CAAACTCTAC ACTCAAGGAT GCACTGCGCA ACTCTGGTGG CGATGGGCTG 420 

GGGCAGATGT CCTTGGAGTT CTACCAGAAG AAGAAGTCTC GCTGGCCATT CTCAGACGAG 480

TGCATCCCAT GGGAAGTGTG GACGGTCAAG GTGCATGTGG TAGCCCTGGC CACGGAGCAG 540 

GAGCGGCAGA TCTGCCGGGA GAAGGTGGGT GAGAAACTCT GCGAGAAGAT CATCAACATC 600 

GTGGAGGTGA TGAATCGGCA TGAGTACTTG CCCAAGATGC CCACACAGTC GGAGGTGGAT 660

AACGTGTTTG ACACAGGCTT GCGGGACGTG CAGCCCTACC TGTACAAGAT CTCCTTCCAG 720 

ATCACTGATG CCCTGGGCAC CTCAGTCACC ACCACCATGC GCAGGCTCAT CAAAGACACC 780 

CTGCCCTCTG AGCGTCGCTG GATCTCTGGG AGCTCCTTGA TGGCTCCCAG ACCTTGGCTT 840 

TTGGGAATTG CACTTTTGGG CCTTTGGGCT CTGGAACCTG CTCTCGGTCA TTGGTGAGAC 900 

TTGGAAGGGG CAGCCCCCGC TGGCTTCTTG GTTTTGTGGT TGCCAGCCTC AGGTCATCCT 960

TTTAATCTTT GCTGACGGTT CAGTCCTGCC TCTACTGTCT CTCCATAGCC CTGGTGGGGT 1020 

CCCCCTTCTT TCTCCACTGT ACAGAAGAGC CACCACTGGG ATGGGGAATA AAGTTGAGAA 1080

CATGAGTTTG GGCTGAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1140 

AAAAAAAAAA AAA 1153 

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1983 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

GGCACGAGAG GGGCCGAGCC GACAAGATGT TCTTGCTGCC TCTTCCGGCT GCGGGGCGAG 60 

TAGTCGTCCG ACGTCTGGCC GTGAGACGTT TCGGGAGCCG GAGTCTCTCC ACCGCAGACA 120

TGACGAAGGG CCTTGTTTTA GGAATCTATT CCAAAGAAAA AGAAGATGAT GTGCCACAGT 180 

TCACAAGTGC AGGAGAGAAT TTTGATAAAT TGTTAGCTGG AAAGCTGAGA GAGACTTTGA 240 

ACATATCTGG ACCACCTCTG AAGGCAGGGA AGACTCGAAC CTTTTATGGT CTGCATCAGG 300 

ACTTCCCCAG CGTGGTGCTA GTTGGCCTCG GCAAAAAGGC AGCTGGAATC GACGAACAGG 360 

AAAACTGGCA TGAAGGCAAA GAAAACATCA GAGCTGCTGT TGCAGCGGGG TGCAGGCAGA 420

TTCAAGACCT GGAGCTCTCG TCTGTGGARG TGGATCCCTG TGGAGACGCT CAGGCTGCTG 480 

CGGAGGGAGC GGTGCTTGGT CTCTATGAAT ACGATGACCT AAAGCAAAAA AAGAAGATGG 540 

CTGTGTCGGC AAAGCTCTAT GGAAGTGGGG ATCAGGAGGC CTGGCAGAAA GGAGTCCTGT 600 

TTGCTTCTGG GCAGAACTTG GCACGCCAAT TGATGGAGAC GCCAGCCAAT GAGATGACGC 660 

CAACCAGATT TGCCGAAATT ATTGAGAAGA ATCTCAAAAG TGCTAGTAGT AAAACCGAGG 720

TCCATATCAG ACCCAAGTCT TGGATTGAGG AACAGGCAAT GGGATCATTC CTCAGTGTGG 780 

CCAAAGGATC TGACGAGCCC CCAGTCTTCT TGGAAATTCA CTACAAAGGC AGCCCCAATG 840 

CAAACGAACC ACCCCTGGTG TTTGTTGGGA AAGGAATTAC CTTTGACAGT GGTGGTATCT 900 

CCATCAAGGC TTCTGCAAAT ATGGACCTCA TGAGGGCTGA CATGGGAGGA GCTGCAACTA 960 

TATGCTCAGC CATCGTGTCT GCTGCAAAGC TTAATTTGCC CATTAATATT ATAGGTCTGG 1020

CCCCTCTTTG TGAAAATATG CCCAGCGGCA AGGCCAACAA GCCGGGGGAT GTTGTTAGAG 1080 

CCAAAAACGG GAAGACCATC CAGGTTGATA ACACTGATGC TGAGGGGAGG CTCATACTGG 1140 

CTGATGCGCT CTGTTACGCA CACACGTTTA ACCCGAAGNT CATCCTCAAT GCCGCCACCT 1200 

TAACAGGTGC CATGGATGTA GCTTTGGGAT CAGGTGCCAC TGGGGTCTTT ACCAATTCAT 1260 

CCTGGCTCTG GAACAAACTC TTCGAGGCCA GCATTGAAAC AGGGGACCGT GTCTGGAGGA 1320

TGCCTCTCTT CGAACATTAT ACAAGACAGG TTGTAGATTG CCAGCTTGCT GATGTTAACA 1380 

ACATTGGAAA ATACAGATCT GCAGGAGCAT GTACAGCTGC AGCATTCCTG AAAGAATTCG 1440 

TAACTCATCC TAAGTGGGCA CATTTAGACA TAGCAGGCGT GATGACCAAC AAAGATGAAG 1500

TTCCCTATCT ACGGAAAGGC ATGACTGGGA GGCCCACAAG GACTCTCATT GAGTTCTTAC 1560

TTCGTTTCAG TCAAGACAAT GCTTAGTTCA GATACTCAAA AATGTCTTCA CTCTGTCTTA 1620

AATTGGACAG TTGAACTTAA AAGGTTCTTG AATAAATGGA TGAAAATCTT TTAACGGAGA 1680

CAAAGGATGG TATTTAAAAA TGTAGAACAC AATGAAATTT GTATGCCTTG ATTTTTTTTT 1740

CATTTCACAC AAAGATTTAT AAAGGTAAAG TTAATATCTT ACTTGATAAG GATTTTTAAG 1800

ATACTCTATA AATGATTAAA ATTTTTAGAA CTTCCTAATC ACTTTTCAGA GTATATGTTT 1860

TTCATTGAGA AGCAAAATTG TAACTCAGAT TTGTGATGCT AGGAACATGA GCAAACTGAA 1920

AATTACTATG CACTTGTCAG AAACAATAAA TGCAACTTGT TGTGCAAAAA AAAAAAAAAA 1980

AAA 1983

(2) INFORMATION FOR SEQ ID NO: 43: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1406 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

ATGATGATGA CTTTGAAGAC GATTTTATTC CTCTTCCTCC AGCTAAGCGC CTTGAGGTTA 60 

ATAGTTGGAA AAGACTCTAT AGATATTGAC ATTTCTTCAA GGAGAAGAGA AGATCAGTCT 120 

TTAAGGCTTA ATGCCTAAGC NCTTGGTCTT AACTTGACCT GGGATAACTA CTTTAAAGAA 180 

ATAAAAAATT CCAGTCAATT ATTCCTCAAC TGAAAGTTTA GTGGCAGCAC TTCTATTGTC 240

CCTTCACTTA TCAGCATACT ATTGTAGAAA GTGTACAGCA TACTGACTCA ATTCTTAAGT 300 

CTGATTTGTG CAAATTTTTA TCGTACTTTT TAAATAGCCT TCTTACGTGC AATTCTGAGT 360 

TAGAGGTAAA GCCCTGTTGT AAAATAAAGG CTCAAGCAAA ATTGTACAGT GATAGCAACT 420 

TTCCACACAG GACGTTGAAA ACAGTAATGT GGCTACACAG TTTTTTTAAC TGTAAGAGCA 480 

TCAGCTGGCT CTTTAATATA TGACTAAACA ATAATTTAAA ACAAATCATA GTAGCAGCAT 540

ATTAAGGGTT TCTAGTATGC TAATATCACC AGCAATGATC TTTGGCTTTT TGATTTATTT 600 

GCTAGATGTT TCCCCCTTGG AGTTTTGTCA GTTTCACACT GTTTGCTGGC CCAGGTGTAC 660 

TGTTTGTGGC CTTTGTTAAT ATCGCAAACC ATTGGTTGGG AGTCAGATTG GTTTCTTAAA 720 

AAAAAAAAAA AAAACGACAT ACGTGACAGC TCACTTTTCA GTTCATTATA TGTACCGAGG 780 

GTAGCAGTGT GTGGGATGAG GTTCGATACA GNCGTATTTA TTGCTTGTCA TGTAAATTAA 840

AAACCTTGTA TTTAACTCTT TTCAATCCTT TTAGATAAAA TTGTTCTTTG CAAGAATGAT 900

TGGTGCTTAT TTTTTCAAAA ATTTGCTGTG AACAACGTGA TGACAACAAG CAACATTTAT 960

CTAATGAACT ACAGCTATCT TAATTTGGTT CTTCAAGTTT TCTGKTGCAC TTGTAAAATG 1020 

CTACAAGGAA TATTAAAAAA ATCTATTCAC TTTAACTTAT AATAGTTTAT GAAATAAAAA 1080 

CATGAGTCAC AGCTTTTGTT CTGTGGTAAC CTATAAAAAA AGTTTGTCTT TGAGATTCAA 1140

TGTAAAGAAC TGAAAACAAT GTATATGTTG TAAATATTTG TGTGTTGTGA GAAATTTTTG 1200 

TCATAAGAAA TTAAAAGAAC TTACCAGGAA GGTTTTTAAG TTAGAAATAT TCCATGCCAA 1260

TAAAATAGGA AATTATAAAT ATATAGTTTT AAGCCTGCAT CAGTGGGAGT CTTGGCTATG 1320 

TAGTTATCTA GTTATTATGN AACCACCAAG ATTTTTTTGG CTATTTACCG TAACCAAAGG 1380 

GGCCGATTAA NTGGTTTGAA GNCTTG 1406



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1391 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GGGCCTGAAG GCGGCRCGCC AGTCCCGAGC AGTGCTCGCT CCTGCTCGGG GCGCTGCGGC 60

CCCGGGCGTC GCCATGACCA GTGAGCTGGA CATCTTCGTG GGGAACACGA CCCTTATCGA 120 

CGAGGACGTG TATCGCCTCT GGCTCGATGG TTACTCGGTG ACCGACGCGG TGGCCCTGCG 180 

GGTGCGCTCG GGAATCCTGG AGCAGACTGG CGCCACGGCA GCGGTGCTGC AGAGCGACAC 240 

CATGGACCAT TACCGCACCT TCCACATGCT CGAGCGGCTG CTGCATGCGC CGCCCAAGCT 300 

ACTGCACCAG CTCATCTTCC AGATTCCGCC CTCCCGGCAG GCACTACTCA TCGAGAGGTA 360

CTATGCCTTT GATGAGGCCT TTGTTCGGGA GGTGCTGGGC AAGAAGCTGT CCAAAGGCAC 420 

CAAGAAAGAC CTGGATGACA TCAGCACCAA AACAGGCATC ACCCTCAAGA GCTGCCGGAG 480 

ACAGTTTGAC AACTTTAAAC GGGTCTTCAA GGTGGTAGAG GAAATGCGGG GCTCCCTGGT 540 

GGACAATATT CAGCAACACT TCCTCCTCTC TGACCGGTTG GCCAGGGACT ATGCAGCCAT 600 

CGTCTTCTTT GCTAACAACC GCTTTGAGAC AGGGAAGAAA AAACTGCAGT ATCTGAGCTT 660

CGGTGACTTT GCCTTCTGCG CTGAGCTCAT GATCCAAAAC TGGACCCTTG GACCCGTCGA 720 

CTCACAGATG GATGACATGG ACATGGACTT AGACAGGAAT TTCTCCAGGA CTTGAAGGAG 780 

CTCAAGGTGC TAGTGGCTGA CAAGGACCTT CTGGACCTGC ACAAGAGCCT GGTGTGCACT 840 

GCTCTCCGGG AAAGCTGGGC GTCTTCTCTG AGATGGAAGC CAACTTCAAG AACCTGTCCC 900 

GOGGGCTGGT GAACGTGCCG CCAAGCTGAC CCACAATAAA GATGTCAGAG ACCTGTTTGT 960

GGACCTCGTG GAGAAGTTTG TGGAACCCTG CCGCTCCGAC CACTGGCCAC TCAGCGACGT 1020 

GCGGTTCTTC CTGAATCAGT ATTCAGCGTC TGTCCAATCC CTCGATGGCT TCCGACACCA 1080 

GGCCCTCTGG GACCGCTACA TGGGCACCCT CCGCGGCTGC CTCCTGCGCC TGTATCATGA 1140 

CTGAGGTGCC TCCCAACGTC CGCCCACGCT GACAATAAAG TTGCTCTGAG TTTGGAGACT 1200 

GGTCCTCGCT CCGGGGAGCA AGTGGGGGGC GTGCAGATGT GCCTGTGTCT GTCTCTGAGC 1260

ACCTGGTGTC CGTGTACAAG GATGGATGTG TOCNGTGGCT CCTTGGGAAC TGAGACATAT 1320 

CTCAGGGAAT GGTGTCTGTG CTCAGCCCAT CCACCAGAAG AGTCTGCTCA CAAAAAAAAA 1380 

AAAAAAAAAA A 1391 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1569 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 45: 

GGCACGAGTG GAGATGGCTG CGGCCGTGGC GGGGATGCTG CGAGGGGGTC TCCTGCCCCA 60 

GGCGGGCCGG CTGCCTACCC TCCAGACTGT CCGCTATGGC TCCAAGGCTG TTACCCGCCA 120 

CCGTCGTGTG ATGCACTTTC AGCGGCAGAA GCTGATGGCT GTGACTGAAT ATATCCCCCC 180

GAAACCAGCC ATCCACCCAT CATGCCTGCC ATCTCCTCCC AGCCCCCCAC AGGAGGAGAT 240 

AGGCCTCATC AGGCTTCTCC GCCGGGAGAT AGCAGCAGTT TTCCAGGACA ACCGAATGAT 300 

AGCCGTCTGC CAGAATGTGG CTCTGAGTGC AGAGGACAAG CTTCTTATTG CGACACCAGC 360 

TGCGGAAACA CAAGATCCTG ATGAAGGTCT TCCCCAACCA GGTCCTGAAA GCCCTTCCTG 420 

GAGGATTCCA AGTACCAAAA TCTGCTGCCC CTTTTTGTGG GGCACAACAT GCTGCTGGTC 480

AGTGAAGAGC CCAAGGTCAA GGAGATGGTA CGGATCTTAA GGGACTGTGC CATTCCTGCC 540 

GCTGCTAGGT GGCTGCATTG ATGACACCAT CCTCAGCAGG CAGGGCTTTA TCAACTACTC 600 

CAAGCTCCCC AGCCTGCCCC TGGTGCAGGG GGAGCTTGTA GGAGGCCTCA CCTGCCTCAC 660 

AGCCCAGACC CACTCCCTGC TCCAGCACCA GCCCCTCCAG CTGACCACCC TGTTGGACCA 720 

GTACATCAGA GAGCAACGCG AGRAAGGATT CTGTCATGTC GGCCAATGGG AAGCCAGATC 780

CTGACACTGT TCCGGACTCG TAGCCAGCCT GTTTAGCCAG CCCTGCGCAT AAATACACTC 840

TGCGTTATTG GCTGTGCTCT CCTCAATGGG ACATGTGGAA GAACTTGGGG TCGGGGAGTG 900 

TGTTTGTCAC TTGGTTTTCA CTAGTAATGA TATTGTCAGG TATAGGGCCA CTTGGAGATG 960 

CAGAGGATTC CATTTCAGAT GTCAGTCACC GGCTTCGTCC TTAGTTTTCC CAACTTGGGA 1020 

CGTGATAGGA GCAAAGTCTC TCCATTCTCC AGGTCCAAGG CAGAGATCCT GAAAAGATAG 1080

GGCTATTGTC CCCTGCCTCC TTGGTCACTG CCTCTTGCTG CACGGGCTCC TGAGCCCACC 1140 

CCCTTGGGGC ACAACCTGCC ACTGCCACAG TAGCTCAACC AAGCAGTTGT GCTGAGAATG 1200 

GCACCTGGTG AGAGCCTGCT GTGTGCCAGG CTTTGTGCTG AGTGCTGTTA CATGTATTAG 1260 

TTCCTTTACT GCTGACCACA TTGTACCCAT TTCACAGAGA AGGAGCAGAG AAATTAAGTG 1320 

GCTTGCTCAA GGTCATGCAG TTAGTAAGTG GCAGAACAGG GACTTGAACC AAGCCCTCTG 1380

CTCTGAAGAC CGCGTCCTGA ATTTCTTCAC TAGAGCTTCC TCATCAGGTT ACCCAGAAGT 1440 

GGGTCCCATC CACCATCCAG GTGTGCTTGG ATGTTAGTTC TCCACCCTCG AGGTGTACGC 1500 

TGTGAAAAGT TTGGGAGCAC TGCTTTATAA TAAAATGAAA TATATTCTAA AAAAAAAAAA 1560 

AAAAAAAAA 1569 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH : 1924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
GGGCCCCCCC WCGWKTTTTT TTTTTTTTTT TTTAATTAGG ATAATGCCTT TATTAACGAG 60 
AATGAAACGT TCATTCCTCC TTCCACTCCT TCTCGTTGGT TTTCTGGACA CAGCTCACCT 120
GATCCTGCTA GAAACGTTGT CAGTCTGCTT GTGGCTTCCC TCCITGATTG ACTCACGCTG 180 
TGTGATGTCT TGAGAAGTAT CTATCCACTT CATGTGAATG AGCACTCCAA TATCAGCCAA 240 

CATCAATCAT TCTTACCTAA AGAATAATAA GAAAAAGTTA ATATAAAAGA CAAGGGTATA 300 
AAATAAAGGT TTGAAAATGC TAGTCAACTT CAAAATTTAA AGAGTAAAAA TCCAGAGATA 360 
AAGATTGGGG GTAAGTTACA GCATAAAAAA ATAGGAAGAA ACTTCATGGT GGGGGGGAAA 420
TCTAAAATTA TTCTTACATA AAATAAGTAG ACACCTGAAT TAGAATGAAA ACTCTATITT 480 
CTTTAAAATG TAAAAGCCTG ACTCTCAGTT TCACCAGTCT GAGCACAAGT TTGACTGCAA 540 

CCCAAAATAT ACTATCCCTT ATGTGAAGGT ATGTGACAAC GTTGACCTCA CCAAATGAGT 600 

TTTAACATCA GCTCTTTTTT CATATGAAAG CACATACCCT GCTCCCCATT CAAGTATGTC 660

TTCCATTGTC AGGCAGGCTC ACCACCTTCA GCAGGAGTCC TCCAAGAGTG CCCAACTCCC 720 

CTTCCCACAG TACACAACGC TGTAGTTGTT GTCCTGCAAT CCTTTGTATT TACCTCATTC 780 

TTTCCCATCT AAGTCCTCAC TGAGTTTTAA AGTTAGGGCT GGAAAAGCTA TGCCTTACTG 840 

GGACAGCAAG GAACCAATTT TTTTCTGAGG GAGAAGACAT TCACCTTCAC TATATGCCTG 900 

GCAGGGCCAC AGTGCACAAA ACAAAGATCA GCCTTCATTC AAGTTCCAGG TTTTTCTTCC 960 

TCCCTGAATG ATTACTGCAA AGGGTATATG AAGTAAGAGT TCCCTGTTGC ACATGTACCA 1020 

TCCATAAGGG ATACTATATC GTTTTGCATT CTTCCCCCCA TTCTCCACAT TGTCCTATCT 1080 

TAAGTCCAAG CCCTTTTCAC TCTCAAAAAA AAAAAAAAAA TATTTTTTTC AGCACTGGTG 1140 

TTCAAAAGCA ACGTTTTTAT GGTTAATGGT TTACCAGCAA CTGTTGAGAT TTCCAGTTGA 1200 

GTCTTAAAAA TTGCCAATCA TTATCTAGCA GCAATGACAG ATGATTAGGA GCAGTCAAAT 1260 

CCTCTGAATT CTTTCCCTAA TAGGCAGCCA TTTGAGAACT GCACTAGCTG ACATCACTAA 1320 

AACATTATCA GCTAAAGCCA AAACCAAATA AAGGCCCAGA CCAACATCCT GGCTCTCTAA 1380 

AACCTGTCCA AAATCATTAA GTGAAAGGCA GTAAATGCAG GACTGTGGAT CATGTCACTG 1440 

CAGCTGACAA TGATTAACAA TAGGAGACAT GCAACCCCCA TTAAGGTTAA AAGTCCAAAA 1500 

CTAGTCACAC GCATCTCTTT ATTGGGGAAA AGTGAGACTA TTATGCATTC TTGGTAGGTT 1560 

TGCAACCTTG CATGAAGAGC ACCCATTGCA TTTCTTTCAT CTTTCAGAAA GCACCGGTAT 1620 

CTCTTCCAAG GGCCTAACAG TACGAAAATA CATTCTGGCA TCACACCTCT GAACCCAAGA 1680 

CTGTTCTCAT TAAAAATAAT TTTGGTTTGT AACAAAATTA TGAAATACAA TGCAAGCACC 1740 

TCGGTATAGC ATTATTACTG AAACCACTTA ATTCCCAGCT TTTTGAGTTT TTTAAAAAAA 1800 

CCCACTGCAC TAAGATTCAC AATTCATTGC TACATACAAA TTAAAGCTAG TAAGAACACA 1860 

CTAACGTCAC AAGTTTCTCA TTCTAAAGTG CAAAAGCCTA ATCATCTGAA AGTGAACAGG 1920 

GTAA 1924 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 475 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TGCTGTGGGG CCCAGAAAMC AAGGGACCA3 TGAAAACAMC CCCAGAGACT TGTATCCGCC 60

GCGCACATGA AGACCAAAGC CAGGACCAAG CCCCMASCCT GCTWAACACG GCAGARTCTT 120

AGGAAAGCCA TTGCCAMTYC TGAGCCCTTG AAGGGCAAGG AGGGAAACAG TGTTACCAGA

GCCCAGTAAG AACTGCTGTC ATGAAGGAGG GGCCACCTTG TAAGAGACAT CATTACTACC 180

AGAACTGTGG TGCCAAATTG CTCGTGTCTC TCTTTGGAGA AACCAACCAG ATACATCTGC 240

TGGAGACCCA GGTGGGCACA GAGAAGGGTG GAGAGAGAAT CTGGGAAGAG AAATGGAGAA 300

TAAGCAGCAC AGTGTTATTC ATTTCTGTAA ATTCCTATGT AGAAGGCTCA GTGTTAGAAA 360

TAAAGTTATT CTACTAGTTG CAAGTTAAGT GTTTCTGTTT GTTCTGCTTT CCTGTTAGCA 420

TAAGTAAACT CCCTTTGGAA CTACACAGGT ATGTCTCTCC TTCAACATGT GTGAA 475

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
AAGGGACAGA GACCTGGATT CAGATCTCAT TTTACAATGA AGACCCCAAT GCAGAAAGTC 60
ATGTCTGAAA TTCTGAGCTT ACTCTTCTGC CTGCTGGGAC CTGCTCTGGA TGAGAGAAGG 120
GAGGAAAAGG ACTAATCAGA GGAGCCAATG AAGTCACTCC ATGAGTTTCC TGAACCCTGC 180

CCAGCTAGAG ATTAACGTYT GACCWTCAAC GTAGGACACT GTGCAGATGG CTACTTGCTG 240

300

GCCCAGCCMA CYTCTGTGAR AATCTGCTTC CCTCCACAGC TGACCC 346



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1366 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
TAGGTGTCAG CCGCCACCCC CCCCCCATAT GCAGATTTAC TSGGCATGGT AGTGGCCAGC 60 
TTCTAACACA GCTGGTATTT CAAGTCTCCT GGGACCTCAC TCAGGAATGA TACCCCCTCA 120 
GTAGAAGCAG CAGGTGATCT TAACTCCTTT CAAAGAGCAG GCCTGTCTGG GAAGCCATC? 180

CCTCAGCA3G CACAGCAACC CCTCTGGAAA TGGATCACAA ACTCACTTCT CAGCCAGGCA 240 

GGCCAAGCTT CTATTGTAAC AGTAGGCACA GTATAGTCGG ATCATCACAT CAGCTGGGTT 300 

TTTGGTTTAG TCATCTAGAG TCGTCTGGAC TAAAGGTCTT TCAGGTCTCC TTGCCCTGTG 360 

AGTGCGTGAA CCTCCCCACC CGAATTGCCT CAGTTGTCCT GAGCCTCATG TCTCTCCTGG 420 

TGGTGGGCCA GGCCCCTGCA TGGGAAGGGA GCCTGCTGCG GGGCAGGCCA GCTGGGGGTG 480 

CTCACCTATG CGCAATGANA GTTATTGAAG GACTGGTTGT TGATGTTGGT GAGCGTATCC 540 

TTCATGGCCA GCGCGAAGTC GGCCAGGTCA GCCAGGTGCT GCCAGCGCTC TCTCTCGGAC 600 

TTGTCTTCCT GTGCCAGGGG ACCGTGGAGA AAGTGTCAGG GGCCGCTCAC TGCAGCAGCC 660 

TGCTCTGCTG CCTTCCCTGG CAGTGTTCTG GGGGTGGATT CCCTACAMCT AGATGTTCAA 720 

GGCCTTACTT TTCCTCCCAC AAAGGAGTCG CAGCCACGCT AGCTCTGACT TGCCACTGTG 780 

ACAAAGTTCA CGTAGCAGGT CTAGGCAAAG ACTGGGCAAT TGAGCAGAGG AGACGGACCT 840 

GTGAGTCTGA CCRYGAGSCG GRCCCCTTCA CCTTGGCTGG GCTGGTCCTG GTCCTTAGGT 900 

1TTGTCAGGT TGTCCTTGTT TGGATCCCTC AACTAGGTGA TAAGCACTGG AGGGGGATGA 960 

CCCGCCTTGG ACGTGTTTCT TTAACCTCAT CCATATAATA GGGCCGTGGG ATGGTTGTAG 1020 

AGGTAAAGCA GGATGATGGT GTTTTAAGAC CAGAGCTTGG GACCAGGGCT CCTACACCTA 1080 

ATTTTCTCTC CTGGTAGCTG AACAAAGGTC TAAATTAGCT TAACAAAAGA ACAGGCTGCC 1140 

GTCAGCCAGA GTTCTGAAGG CCATGCTTTC AGTTTCCCTT GTTGACAATT GCTCTCCAGT 1200 

TCCTATGAAA GCACAGAGCC TTAGGGGGCC TGGCCACAGA ACACAACCAT CTTAGGCCTG 1260 

AGCTGTGAAC AGCAGGGGGT TGTGTGTCTG TTCTGTTTCT CTGCTTGCCG AACTTTCTCA 1320 

ATAAACCCTA TTTCTTATTT ATAAAAAAAA AAAAAAAAAA AAAAAA 1366 



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GCAGTAATTC CTGTTAGCCA CTGCATCCAC CAAAACTAGT TTATTTTTCC CCTCAAATTC 60

ATGAITTTTA CGTCTGTTAC AAAGGGAATT TTGCTGATAG CTCTTTGGGT CCCACTGTTC 120 

CATTTTATGC TAATAGATTC CATTCTAGGG CCCAGCCGTC TCTTGACTGA TGGTGTTCCC 180 

TTTAACCCTT GGCATGTATA ATAGAATTTT GGTGAATGAA AGAACCCAAA TAGGCCAGAT 240

AGTCCCCCCA GGCCCTGATA TCCATAAAAG GCTTGGGAAT GCATTATGTA ATTGTCCTTA 300 

GTCTTTTTGT TGTTTTAGAA AAAAAAAACA AGATGGGCTC AGATGGATGC CTACGTAAAA 360

ATGGTTCCTA GCTGTGTACT CATAACTTTT CTTTGAATTG AGTAGTGAAA GGAAGGAGGA 420 

GGAAAGGAAA TTAAATGTCC TTCTAGTATT CTCTGGACTC AAGTCTGACA TATGAGATAA 480 

TAACCTATAT TGAAATGCCA AGAATTGTAT CTGAAACAAG AGAACAGTTT GACACATTTA 540 

TCATGCCTTC ATATTACATA TTAACTGAAA CCAATTAATA AACATATGAA ATATCCATTG 600 



CACAAGGCAA AGGCACCTAA ACCTTTTGTT TCTTTTTCTA CATAGCAGAA ATTGATTTTT 660

TTTTTATTTT TTTAGGGGAA CCTATATAAT TATGACCCAG TGATGTCTTT TGGTGACTTA 720

AGCTTATGAA TTCAGGTTAC AATTGAGTTG ATTCTAGATG GTTACTACCT TGAAAAGGAT 780

GTTGGTGCCT TATGTGACAC GAGCCAGAGC CTGCTGGGGA ATAAACAAAG CAGGTTTCAT 840

GCCAACACCA ACTCGTAGCT TTAGTGGGCA GATGGGGAGT GGTTCACAGA CTTCCCAAAA 900

TGTGGGGGCT TTGGGATTTT CCACACCATC CCACGTGTGT TGTTCATTCT TCCTCTTTTC 960

ACACTCTTGG ATGGATWATT TGRAAATGGT GRAAWYMMCY YYKRAATTTG CCCAATAGCC 1020

WTGRGCCACC ATTCTTWATG ACACCATAAC CAAATAGTTC CWTAATGTTG AAATATTAGA 1080

AACCTGTTAC CAGCCYKSMA KTOACCCWWA WTTTTCCCAT GTTTGTGGAA TTGATATTGA 1140

AATAGCAGGG CTAAGGAATT ACTGGCAAGT TTTAGCCTGT GGGTAATACC TTAGGGTTAT 1200

TTAAATATTT GTAATTTTAT TTAAATGTTC ATGAATGTTT GAAAGGAACA AAATTATCAG 1260

GGATGGCTCT TTGCCATGGG TCTTATTTTC ACCCTCTTTT CTGTAAGAAA AAAGAACAAT 1320

GTCTTAATGT ATTTTTAAAG TTTTTGGTAT AGTTTCTAAT TCCAATTTTA ATAAAAGTTT 1380

WTRTAAAAA AAAAAAAAAA AAAAA 1405

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 504 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CGGATTTTCT AGGACCCCAA AAAAAAAAAA AGGGNAAAAA AAACCCNCAA AACCANCCAA 60 

AACCCCAAAA AAAAAAAAAA TCCACAAAAA CAAAAAAACT ATAAAAAAGA AAGAATTAAA 120
AACTTTCAGA GAATTACTAT TTACTTTATT AACTTACGGA TTTATTATAT AAATATATAT 180
TCACCTAGCA ACATATCTCT GCCGTCTCTC CTGCTCTCAT AATGAAGACA TAGCCGATTC 240
TCTGCCCGGG CCCCTTGCTG ATGCTCCTCC GGGTCTGCGT CGGGCGTGGG TCTCTGGGGA 300
CCCTCCAGAG GTCGAGGTGG GCTGATGGCC TGGCTGCCTG GTGGTTGATX3 GTTTTGCTCC 360
CCCTACCTTT TTTTTTTGAG TTTATTCTGA TTGATTTTTT TTCTTGGTTT CTGGATAAAC 420
CACCCTCTGG GGACAGGATA ATAAAACATG TAATATTTTT AAGAAGGAAA AAAAAAAAAA 480
AAAAAACTNG GGGGGGGCCC CGAA 504

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 777 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

NAAGTATCTT GGCCAGTTTA TTACAGAGGA CGATAAATGA TTCCATGTGG ATAGGGCATA 60 

ACATACAGAG AATGAGACTA TGCCAGAAAT GGGAGGAGGC ATTTGAAACA ACATGAGTAT 120 

CTCAGGGACA GATGGATTGA TTCTGCTATT GGTAGGCCTG GAAGCAANGG TCAGAAGTAG 180 

CAAAAAATGG ATACCAAAAG CACTATTWGT CACCCAAGCT AAGTGGAATA GCTGGCCCAG 240 

TAGGAGAAAT GCAGGTTTTG CTCTACACTA AGTTCTCCAA CTCTTGATAA GCCTCCAAAA 300 

ACAAATGTTA GGGGAAAAAA ACGCAGCTGG TTATGAAAAG ATATATCTCA TTTCATTAAA 360 

AAATCAATGT CAATGCTGTT AATAGAATCC TTTTATCTTC AGGACAGAGG CAATGCCCTA 420 

AACAAACACC AGCTCAAGAG CCTCTGATGC CAACCTAGAG GGTACCCAAA CACAAACTTA 480 

GCATAGAGGT AAGAATCTCT ATGTCTTTTG GTGGAGGCAA AGCCATTTGG TTGGTACTTC 540 

ACAGGAACAT CTTTCTACCA AGTCTTCATC ATATGGTATG TGCCACGAGT CTCCAGTTGT 600 

TTGCACCACT GTGTCATAGC TGAGAATACG CTGAAAGGTT AGTTTTGATC CTGGAAACCT 660 

ATTTACAATT GCCAGCTGAT GTCCCTGCTG CCACTTAAAA AAGGCTTGGG TCTGGCATAG 720 

GCAGAMAGGC CTGTGGTCCC CTCGTGCCGA TTCTNGGCTC GAGGCCAATT NCCTTAT 777 






(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 602 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATGACTACAG TGTTATACCC TCCAATCTTT GCAGGTGGGC ATGGAACACT GCTTGTATCA 60 

CTCTGTGCAC GGTATAAATC CATATATCCA CAAAAACACA CATCCATCCA TCAACATATA 120 

CATGGTTTGG GATGAGCAGG TCAATAGTTT TGAGAGGGAG TTTGTTCCTT TTTTTTTTCT 180 

CATTATACTC TTAAATTGTT GTCAGTTATC AAACAAACAA ACAGAAAAAT TGTTTGGAAA 240 

AACCTTGCAT ACGCCTTTTC TATCAAGTGC TTTAAAATAT AGACTAAATA CACACATCCT 300 


GCCAGTTTTT TCTTACAGTG ACAGTATCCT TACCTGCCAT TTAATATTAG CCTCGTATTT 360 

TTCTCACGTA TATTTACCTG TGACTTGTAT TTGTTATTTA AACAGGAAAA AAAACATTCA 420 

AAAAAAGAAA AATTAACTGT AGCGCTTCAT TATACTATTA TATTATTATT ATTATTGTGA 480 

CATTTTGGAA TACTGTGGAA GTTTTATCTC TTGCATATAC TTTATACGGA AGTATTACGC 540 

CTTAAAAATA CGAAAATAAA TTTTACAAGG TTCCGGTTTT GGTGGTGGAA AGAGTAAATT 600 

GA 602 






(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1749 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

AGTCACTGAC TTGGAGCCGC TCGGGGGAAG TCCCGCCCAG ACAGGCGGTG GGTGGGAATG 60 

CCTCACTTCA GTTTGAAGAG GGTCCGGATC CAAAGGGGTT AAAACGAGCG AACCCCGATC 120 

CCCGACCACA CTTCCCGCCT CCCTAAAACG CACACCCCGC TAGCCATGGG CAGCCGCGAC 180 

CACCTGTTCA AAGTGCTGGT GGTGGGGGAC GCCGCAGTGG GCAAGACGTC GCTGGTGCAG 240 

GATTATTCCC AGGACAGCTT CAGCAAACAC TACAAGTCCA CGGTGGGAGT GGATTTTGCT 300 


CTGAAGGTTC TCCAGTGGTC TGACTACGAG ATAGTGCGGC TTCAGCTGTG GGATATTGCA 360 

GGGCAGGAGC GCTTCACCTC TATGACACGA TTGTATTATC GGGATGCCTC TGCCTGTGTT 420 

ATTATGTTTG ACGTTACCAA TGCCACTACC TTCAGCAACA GCCAGAGGTG GAAACAGGAC 480 

CTAGACAGCA AGCTCACACT ACCCAATGGA GAGCCGGTGC CCTGCCTGCT CTTGGCCAAC 540 

AAGTGTGATC TGTCCCCTTG GGCAGTGAGC CGGGACCAGA TTGACCGGTT CAGTAAAGAG 600 




WO 98/39448 



276 



PCT/US98/04493 



AACGGTTTCA CAGGTTGGAC AGAAACATCA GTCAAGGAGA ACAAAAATAT TAATGAGGCT 660
ATGAGAGTCC TCATTGAAAA GATGATGAGA AATTCCACAG AAGATATCAT GTCTTTGTCC 720
ACCCAAGGGG ACTACATCAA TCTACAAACC AAGTCCTCCA GCTGGTCCTG CTGCTAGTAG 780
TGTTTGGCTT ATTTTCCATC CCAGTTCTGG GAGGTCTTTT AAGTCTCTTC CCTTTGGTTG 840
CCCACCTGAC CATTTTATTA AGTACATTTG AATTGTCTCC TGACTACTGT CCAGTAAGGA 900
GGGCCCATTG TCACTTAGAA AAGACACCTG GAACCCATGT GCATTTCTGC ATCTCCTGGA 960
TTAGCCTTTC ACATGTTGCT GRCTCACATT AGTGCCAGTT AGTGCCTTCG GTGTAAGATC 1020
TTCTCATCAG CCCTCAATTT GTGATCCGGA ATTTTGTGAG AAGGATTAGA AATCAGCACC 1080
TGCGTTTTAG AGATCATAAT TCTCACCTAC TTCTGAGCTT ATTTTTCCAT TTGATATTCA 1140
TTGATATCAT GACTTCCAAT TGAGAGGAAA ATGAGATCAA ATGTCATTTC CCAAATTTCT 1200
TGTAGGCCGT TGTTTCAGAT TCTTTCTGTC TTGGAATGTA AACATCTGAT TCTGGAATGC 1260
AGAAGGAGGG GTCTGGGCAT CTGTGGATTT TTGGCTACTA GAAGTGTCCC AGAAGTCACT 1320
GTATTTTTGA AACTTCTAAC GTCATAATTA AGTTTCTCTT GTCTTGGCAT CAAGAATAGT 1380
CAAGTTTTTT GGCCGGGCAT GGTGGCTCAT GCCKGTAATC CCAGCACTTG GGGAGGCCAA 1440
GGCAGGCGGA TCACATGAGG CCAGGAATTC GAGACCAACC TGGTCAGCAT GGCAAAACCC 1500
CGTCTCTACT AAAAGTACAA AAATTAGCCA GGCGTGATGG CACGTGTCTG TAATCCCAGC 1560
TACTCTGGAG ACTGAGGTGG GAGAATCGCT TGAGACTGGG AGGCAGAGGT TGCAGTGAAC 1620
CGAGATCATG CCACCGCACT TCAGCCTGGG TGACAGAGAA GGACTCCGTC TCAAAAAAAA 1680
AAAAAAAAAA AAAACTCGAG GGGGGGCCCG GTACCCAAAT CGCCSTGATA GTGATCGTAW 1740
ACAATCNAA 1749






(2) INFORMATION FOR SEQ ID NO: 55: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

AAAGAGATGG GCTCTTTATT TTCTCGAAAA ACCAATTTGG AGTTACTCAT TTTTCCATAA 60 

CATTAAATTT CTTACAGTGA ACTACATATT GTCCATAAGT GCTTCATCAG GACTCATCGC 120 

CCTCCTGTCT ACTGGCTCCA AATAGACCAT GTCAGCTTCA CCCCCTGGCV TTGTGTCTAT 180 

GGGTGGCCTG TGGTATATGG AAAAGTAGCA GGGTGGTCAG GGTGGGAGAC ACAAGATGTT 240
TTTATAGTCT AGAGCCTTTA AAAAACCCAG CAGAATGTAA TTCAGTATTT GTTTATTGGC 300
TGTTTTTTGA CAGATTGTTG AAATTAAATG AATTGAAAGG GAAACTCAGA GTACTAGGAC 360
GTTTATTAAA AGGAAAAAAA TGTCTTGCAA TGTGCTGTAA TCACAAGAGG AGAAAATAAC 420
TTGTTTCCTT GATCTGTCAG AGGTCACAGT AACCTGGGCC GAGCTGTTAT TATTTATTAT 480
ATAATAGTAG TAGGAAGTTA ATAACTGGTT CTCTGTGTTC CAAGCACAAT ATTACAACTT 540
CTTTTGAACC GTAAATATCA GAATGAATCC TCTTCCCAGG GGATTGAACA GAAGCTTAAT 600
GTTTACAAGT GTTTGAATTT GTGATCTGAA ATAACACAAA ATTAAAAACA TGATTTCTCT 660
AATTTTCCAA CTAGAGGAAG AGAAACTTGT GGAAAAGTTC TTTTTTTTTC TTTTTTTTTT 720
CTTAAAGAAG GGCAGCCAAG GTAGTAACCT AAAAATAGTG CCCAGGCATA TGAGAGTTGT 780
CCTACGAGGT TAAAGAACAC ACTGTTCCAC TGTATGGCTT TGGCCCTGAG TGGCCAGGGA 840
GGTCAACTTG ACCCTGCCAT GTTGGTTTGA CTTACTAAGA CACAGGAATC ATTGTTTTCC 900

TTGACCAGGG TCTCACACCC TGGAGGAATG TTAAGTAAGA GAAAGAACCT CTTTCCTGAA 960 

TATTGACATG TAAAAGACCA AAGTAATTTT TCTGAACTTC TGCAATTCTG AGAACTCTCC 1020 

AAGGAATTTA CAGTGATTTT AGTGCTTGTC AGCATTTTTC CATGAGGACT TTCATACATT 1080 

TGACTCTTTA GTTCACAGGT TCCCATTGAT TGTGAGCAAG ATATTTATCT CTTTAGCCCT 1140 

TGGGGATCCA GCTGAGAGCA ATCTCTTGCA TTTTTTTACC CGTGTATGTA CAGATATCAT 1200 

TTCTTGTGTA TGCCATGACT TGAAAAAGTT TGGGAAGCTC TTTAGCAATA TCAGCTAAAA 1260 

GGATATGAAA TCACAGGTGA TAGCAGTTGT CATTCAGTAA TTTCCTACAA GCAGCACCCC 1320 

AAAGGAAATA TAGTCCTAAT CTTTACTATC CACTTCTAAA TTTAATGTCA ATTTCATACA 1380 

TGTTATTAGT TGTTTTCTTT ATAATTTTAT AAAAATTATT CATCGGGAGT TTAACTTCCA 1440 

CTTCCATGCT ATCGGATGTG TTGGGCTCCA TGCAAGAACT TGGAAGAAAA ACAGGCAGGA 1500 

ATGCATTTGC ATAATGACCC AGATCATCAT TTTCTGCAAC TGAGAATTAT ATTTCATCAT 1560 

TGCTTCTAGA AGTCTGCAAT TCTTTACTTT TCTTTGGTGC ATTATTATCT AGGTGCCATC 1620 

ACTGGATAAT GTGGAGTGAC TAGAGAAGTC AYATATCACT GTAAGGTACA GTTAGGOGTA 1680 

ACACTTTAGA GGTTTATTAT TTTTAAAAAA CTTTTCTTGA ACTCCTGGGC CAACATGGGT 1740 

GAAACCCCGT CTTCTTACTT AAAAATACCC AAAATTAGGC CAGGGGCGTG GATGGGTGGG 1800 

GTGCCTGTTA ATCTTCAGCT ACTTNGGGGA GGGCTTGAAG CCAGGGAGGA ACTGCCCTGG 1860 

ANCCCCGGGG NGGGCCAGNA GGTTTGCCAG TTGAGT 1896 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1753 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

TCTTTTTAAA ATAGACATTT GTGGGGCTCA CACAATATAT GAAATAGTAC CCTCTAAAAA 60 

AGAGAAAAAA AAAATCAGGC GGTCAAACTT AGAGCAACAT TGTCTTATTA AAGCATAGTT 120 

TATTTCACTA GAAAAAATTT AATATCAAGG ACTATTACAT ACTTCATTAC TAGGAAGTTC 180 

TTTTTAAAAT GACACTTAAA ACAATCACTG AAAACTTGAT CCACATCACA CCCTGTTTAT 240 

TTTCCTTAAA CATCTTGGAA GCCTAAGCTT CTGAGAATCA TGTGGCAAGT GTGATGGGCA 300 


GTAAAATACC AGAGAAGATG TTTAGTAGCA ATTAAAGGCT GTTTGCACCT TTAAGGACCA 360 

GCTGGGCTGT AGTGATTCCT GGGGCCAGAG TGGCATTATG TTTTTACAAA ATAATGACAT 420 

ATGTCACATG TTTGCATGTT TGTTTGCTTG TTGAATTTTT GAACAGCCAG TTGACCAATC 480 

ATAGAAAGTA TTACTTTCTT TCATATGGTT TTTGGTTCAC TGGCTTAAGA GGTTTCTCAG 540 

AATATCTATG GCCACAGCAG CATACCAGTT TCCATCCTAA TAGGAATGAA ATTAATTTTG 600 

TATCTACTGA TAACAGAATC TGGGTCACAT GAAAAAAAAT CATTTTATCC GTCTTTTAAG 660 

TATATGTTTA AAATAATAAT TTATGTGTCT GCATATTGCA GAACAGCTCT GAGAGCAACA 720 

GTTTCCCATT AACTCTTTCT GACCAATAGT GCTGGCACCG TTGCTTCCTC TTTGGGAAGA 780 

GGAAAGGGTG TGTGAACATG GCTAACAATC TTCAAATACC CAAATTGTGA TAGCATAAAT 840 

AAAGTATTTA TTTTATGCCT CAGTATATTA TTATTTAATT TTTTAGGTAA TGCCTATCTC 900 


TTGGTCTATT AAGGAAAGAA GCAATCAGTA GAGAATTCAG GATAGTTTTG TTTAAATTCT 960 

TGCAGATTAC ATGTTTTTAC AGTGGCCTGC TATTGAGGAA AGGTATTCTT CYATACAACT 1020 

TGTTTTAACC TTTGAGAACA TTGACAGAAA TTATGCAATG GTTTGTTGAG ATACGGACTT 1080 

GATGGTGCTG TTTAATCAGT TTGCTTCCAA AGTGGCCTAC TCAAGAGGCC CTAAGACTGG 1140 

TAGAAATTAA AAGGATTTCA AAAACTTTCT ATTCCTTTCT TAAACCTACC AGCAAACTAG 1200 

GATTGTGATA GCAATGAATG GTATGATGAA GAAAGTTTGA CCAAATTTGT TTTTTTGTTG 1260 

TTGTTGTTGT TTTGAATTTG AAATCATTCT TATTCCCTTT AAGAATGTTT ATGTATGAGT 1320 

GTGAAGATGC TAGCGAACCT ATGCTCAGAT ATTCATCGTA AGTCTCCCTT CACCTGTTAC 1380 

AGAGTTTCAG ATCGGTCACT GATAGTATGT ATTTCTTTAG TAAGAATGTG TTAAAATTAC 1440 

AATGATCTTT TAAAAAGATG ATGCAGTTCT GTATTTATTG TGCTGTGTCT GGTCCTAAGT 1500 

GGAGCCAATT AAACAAGTTT CATATGTATT TTTCCAGTGT TGAATCTCAC ACACTGTACT 1560 

TTGAAAATTT CCTTCCATCC TGAATAACGA ATAGAAGAGG CCATATATAT TGCCTCCTTA 1620 

TCCTTGAGAT TTCACTACCT TTATGTTAAA AGTTGTGTAT AATTGTTAAA ATCTGTGAAA 1680 

GAATAAAAAG TGGATTTAAA TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1740 
AAAAAAAAGG GGG 1753 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1220 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GCGGAAGTTA CTGCAGCCGC GGTGTTGTGC TGTGGGGAAG GGAGAAGGAT TTGTAAACCC 60 

CGGAGCGAGG TTCTGCTTAC CCGAGGCCGC TGCTGTGCGG AGACCCCCGG GTGAAGCCAC 120 

CGTCATCATG TCTGACCAGG AGGCAAAACC TTCAACTGAG GACTTGGGGG ATAAGAAGGA 180 

AGGTGAATAT ATTAAACTCA AAGTCATTGG ACAGGATAGC AGTGAGATTC ACTTCAAAGT 240 

GAAAATGACA ACACATCTCA AGAAACTCAA AGAATCATAC TGTCAAAGAC AGGGTGTTCC 300 

AATGAATTCA CTCAGGTTTC TCTTTGAGGG TCAGAGAATT GCTGATAATC ATACTCCAAA 360 

AGAACTGGGA ATGGAGGAAG AAGATGTGAT TGAAGTTTAT CAGGAACAAA CGGGGGGTCA 420 
TTCAACAGTT TAGATATTCT TTTTATTTTT TTTCTTTTCC CTCAATCCTT TTTTATTTTT 480
AAAAATAGTT CTTTTGTAAT GTGGTGTTCA AAACGGAATT GAAAACTGGC ACCCCATCTC 540
TTTGAAACAT CTGCTAATTT GAATTCTAGT GCTCATTATT CATTATTGTT TGTTTTCATT 600
GTGCTGATTT TTGGTGATCA AGCCTCAGTC CCCTTCATAT TACCCTCTCC TTTTTAAAAA 660
TTACGTGTGC ACAGAGAGGT CACCTTTTTC AGGACATTGC ATTTTCAGGC TTGTGGTGAT 720
AAATAAGATC GACCAATGCA AGTGTTCATA ATGACTTTCC AATTGGCCCT GATGTTCTAG 780
CATGTGATTA CTTCACTCCT GGACTGTGAC TTTCAGTGGG AGATGGAAGT TTTTCAGAGA 840
ACTGAACTGT GGAAAAATGA CCTTTCCTTA ACTTGAAGCT ACTTTTAAAA TTTGAGGGTC 900

TGGACCAAAA GAAGAGGAAT ATCAGGTTGA AGTCAAGATG ACAGATAAGG TGAGAGTAAT 960 

GACTAACTCC AAAGATGGCT TCACTGAAGA AAAGGCATTT TAAGATTTTT TAAAAATCTT 1020 

GTCAGAAGAT CCCAGAAAAG TTCTAATTTT CATTAGCAAT TAATAAAGCT ATACATGCAG 1080 

AAATGAATAC AACAGAACAC TGCTCTTTTT GATTTTATTT GTACTTTTTG GCCTGGGATA 1140 
TGGGTTTTAA ATGGACATTG TCTGTACCAG CTTCATTAAA ATAAACAATA TTTGTAAAAA 1200 
TCAWAAAAAA AAAAAAAAAA 1220 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1049 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

TCGCGCCTGC AGACACAGCA TCTACTCAGC GTGGGTCACC TCTGTGAACA TCACTGACTG 60
CAAGCCTCCC TCAATTTCTG GTGCAGCCCA TCAGGGACCC ACAGCGCCTG GGAGGATGGT 120
GCGGATCTTG GCCAATGGGG AAATCGTGCA GGACGACGAC CCCCGAGTGA GGACCACTAC 180
CCAGCCACCA AGAGGTAGCA TTCCTCGACA GAGCTTCTTC AATAGGGGCC ATGGTGCTCC 240
CCCAGGGGGT CCTGGCCCCC GCCAGCAGCA GGCAGGTGCC AGGCTGGGTG CTGCTCAGTC 300
CCCCTTCAAT GACCTCAACC GGCAGCTGGT GAACATGGGC TTTCCGCAGT GGCATCTCGG 360
CAACCATGCT GTGGAGCCGG TGAOCTCCAT CCTGCTCCTC TTCCTGCTCA TGATGCTTGG 420
TGTTCGTGGC CTCCTCCTGG TTGGCCTTGT CTACCTGGTG TCCCACCTGA GTCAGCGGTG 480
ACCTCTGAGG GCTGATAGGG GTGGGTTTGT TGAGAGGGAC TTGCTGGGCC TTGGTGTGAG 540
AGCAGGCATA TTTGGAGGGG ATCTGGTGGT GCCTTGAAGG TATGATCAGA GAGGGGACCA 600
CAGGTGTGTG TTTCCCCTTT GTGTTAAGCG TGAGGCAGAG GGAGACGTTA GTCCCAGCAT 660
TTCCCAAAGT GTGGGTGGGT CCGTTGGTTC CCGAGATACT TTTAGGTOGT ATGGGGCCTG 720
CATTAAGTGG CACAAAATCA GAGCAAGAAA GCGATGCCCT TCCCAATTCT CTCAATCCTT 780
TTATGCCGAG AAGATCTCAG CTGGATGCCA ACATGTTCCG ATGCCTGTGG AAGACATGCC 840
GACGTCTCCT CTGCCTAGGG AGCAGGACTT GGGCTTAGGG CAGGTGGAAA AAATTCCAGA 900
CTTTTTTAGC ACTGTTTTTG TTTTAATGGT ATATTTTTAT TGGCTACTTT ATTGTTTAGG 960
ACAAGTGGTA GTGGCATTCT ATTTATTGTG ACCTTTTCAA TAAATAGATT TAAGTAAAAA 1020
AAAAAAAAAA AAAACTCGAG GGGGGGCCC 1049



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1776 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

AAAGAGGATG TGMAGCTAGA GGTCCCCGAT GGCTGGTCGG ATGGGAAGCA CAAGGCTGAG 60 

GGACTGGATT GTAAAGGCAC TAAGTCGTTC TGCGGTGAGA ATCAGACATG GGGGACCTCT 120 

AGCTTCACAT CCTCTTTCCT TGCAGSTCTG GACATCCTGA GCCCAAGTCC CCCACACTCA 180 

GTGCAGTGAT GAGTGCGGAA GTGAAGGTGA CAGGGCAGAA CCAGGAGCAA TTTCTGCTCC 240 

TAGCCAAGTC GGCCAAGGGG GCAGCGCTGG CCACACTCAT CCATCAGGTG CTGGAGGCCC 300 

CTGGTGTCTA CGTGTTTGGA GAACTGCTGG ACATGCCCAA TGTTAGAGAG CTGGCTGAGA 360 

GTGACTTTGC CTCTACCTTC CGGCTGCTCA CAGTGTTTGC TTATGGGACA TACGCTGACT 420 

ACTTAGCTGA AGCCCGGAAT CTTCCTCCAC TAACAGAGGC TCAGAAGAAT AAGCTTCGAC 480 

ACCTCTCAGT TGTCACCCTG GCTGCTAAAG TAAAGTGTAT CCCATATGCA GTGTTGCTGG 540 

AGGCTCTTGC CCTGCGTAAT GTGCGGCAGC TGGAAGACCT TGTGATTGAG GCTGTGTATG 600 

CTGACGTGCT TCGTGGCTCC CTGGACCAGC GCAACCAGCG GCTCGAGGTT GACTACAGCA 660 

TCGGGCGGGA CATCCAGCGC CAGGACCTCA GTGCCATTGC CCGAACCCTK AANAAAAACC 720 

ATTAAAGTTA CGACGGCAGC AGCAGCCGCA GCCACATCTC AGGACCCTGA GCAACACCTG 780 

ACTGAGCTGA GGGAACCAGC TCCTGGCACC AACCAGCGCC ASCCAGCAAG AAAGCCTCAA 840 

AGGGCAAGGG GCTCCGAGGG ANCGCCAAGA TTTGGTCCAA GTCGAATTGA AAGRACTGTC 900 

GTTTCCTCCC TGGGGATGTG GGGTCCCAGC TGCCTGCCTG CCTCTTAGGA GTCCTCAGAG 960 

AGCCTTCTGT GCCCCTGGCC AGCTGATAAT CCTAGGTTCA TGACCCTTCA CCTCCCCTAA 1020 

CCCCAAACAT AGATCACACC TTCTCTAGGG AGGAGKCAAA TGTAGGTCAT GTTTTTGTTG 1080 

GTACTTTCTG TTTTTTGTGA CTTCATGTGT TCCATTGCTC CCCGCTGCCA TCCTCTCTCC 1140 

CTTGTTTCCT TAAGAGCTCA GCATCTGTCC CTGTTCATTA CATGTCATTG AGTAGGTGGG 1200 

TAGCCCTGAT GGGGGTCGCT CTGTCTGGAG CATAACCCAC AGGCGTTTTT TCTGCCACCC 1260 

CATCCCTGCA TGCCTGATCC CCAGTTCCTA TACCCTACCC CTGACCTATT GAGCAGCCTC 1320 

TGAAGAGCCA TAGGGCCCCC ACCTTTACTC ACACCCTGAG AATTCTGGGA GCCAGTCTGC 1380 

CATGCCAGGA GTCACTGGAC ATGTTCATCC TAGAATCCTG TCACACTACA GTCATTTCTT 1440 

TTCCTCTCTC TGGOCCTTGG GTCCTGGGAA TGCTGCTGCT TCAACCCCAG AGCCTAAGAA 1500 

TGGCAGCCGT TTCTTAACAT GTTGAGAGAT GATTCTTTCT TGGCCCTGGC CATCTCGGGA 1560 

AGCTTGATGG CAATCCTGGA AGGGTTTAAT CTCCTTTTGT GAGTTTGGTG GGGAAGGGAA 1620 



GGGTATATAG ATTGTATTAA AAAAAAAAAG GTATATATGC ATATATCTAT ATATAATATG 1680 

ACGCAGAAAT AAATCTATGA GAAATCTATC TACAAAMWAA AAAAAAAAAA AAAAAAAAAA 1740 

AGGAATTCGA TNTCAAGCTT ATCGATACCG TCNACC 1776 

(2) INFORMATION FOR SEQ ID NO: 60: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 443 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ACAGATAAAT AAATAAATAA TAAATTAAAT TAAATAAAAA ATCTGAGCTA ATCTGAATAA 60 

ATTGAGAGAT TTCACATGAA AGCCAGGATT TCTGGCTTCC CAGGAACAGT CAGAAGAGCT 120 

AGCTAGCAAC ACTGGTCTGC TTGGCTACCT TCTTTGGAAC AACATGAAAT CTAGCTCCCT 180 

TTTTTTTTTT TTTTTGGCCC ACTTCATCCA TTCACATGAC CTGCCTQGCC TCTGCAGGTA 240 

AGTGAGTATG CAACAAAAAT GTAGCACAGG TTTTGTCGCT GAACTACGTG GTTTCAGGTC 300 


CAGCTCTGCC ACTTGCTAGC ATGACCTCGT GCCGAATTCC NGCACGAAGT TTTTTTTTTT 360 

TTTTTCAGTG CTCCAGTCCC CCTATTGGAG AATCCTGCCC CCCCCTQGGA CAGAATGTTC 420 

ACCCTGGCCC CGCGANTCCC TGA 443 



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2888 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

TTAATGTTGT CAATAACCAC CAGGCCAAAC AGAATTTATA TGACCTGGAT GAAGATGATG 60 

ATGGTATAGC TTCCGTTCCT ACTAAACAGA TGAAGTTTGC AGCCTCAGGC GNCTTTCTCC 120 

ACCACATGGC TGGGCTAAGC AGTTCCAAGC TTTCCATGTC CAAGGCCCTC CCTCTCACCA 180 


AAGTGGTTCA GAATGATGCA TACACAGCTC CTGCTCTCCC TTCCTCTATT CGAACAAAAG 240 

CCTTGACCAA CATGTCCCGG ACACTGGTGA ACAAGGAAGA ACCCCCCAAA GAGCTGCCAG 300 

CTGCTGAGCC TGTTCTCAGC CCATTGGAAG GCACCAAGAT GACTGTGAAT AATCTGCACC 360 






CTCGAGTCAC TGAGGAGGAC ATTGTTGAGC TTTTCTGTGT GTGTGGGGCC CTCAAGCGAG 420
CTCGACTGGT CCATCCTGGG GTAGCGGAGG TGGTGTTTGT GAAAAAGGAC GATGCCATCA 480
CCGCATATAA GAAGTACAAC AACCGGTGTC TGGACGGGCA GCCGATGAAG TGCAACCTTC 540
ACATGAATGG GAATGTTATC ACCTCAGACC AGCCCATCCT GCTGCGGCTG AGTGACAGCC 600
CATCAATGAA AAAGGAGAGC GAGCTGCCTC GCAGGGTGAA CTCTGCCTCC TCCTCCAACC 660
CCCCTGCYGA AGTGGACCCT GACACCATCC TGAAGGCACT CTTCAAGTCC TCAGGGGCCT 720
CTKTGACCAC GCAGCCCACA GAATTCAAAA TCAAGCTTTG AGCAGGGGAG TGAGGCAGCC 780
AGAAGTGGGG GCAGAGGAGG GTGGCTCTGT TTCCCCAAGG CAAAGCTTAT GACCAATGGG 840
CCATCGGACT GGAGACCCCT GATTGTGGGA AGGGTTGCCA GGGATAAAGA GCTTCCTCAC 900
TGGATGGGAC CCGCCTTTCT GTGTTGTGTT CTGCCCTGTG CTCTTCTCTC TACGTTAACG 960
TTTCCTGTAG TATGTTTCTT CATCTCATCG CCAAGGTAGG CTTGTGTTTT TCAGTGTCTG 1020
CCTCCCCGAG CCTCAGCCCC AAGCTGATTT CTTATCTGGA AATGGTACAC TGAATTCTCT 1080
GGGTGGCTTT CTTGTGGCCC CATGGGATGC AGCGTGGGGG CTGTCTGAAG GACCCTGCTT 1140
TTTCCAGGGG CCGAGGGGCT GCCTTTCCTT TGTGTGTATT AAGCTTTTCA AACAATGGAG 1200
GGGATGGAGA GCCCTQGTGT CCTGACGGGA GCCAGGTCGG CCTGAGAGCT GTGCCGCTCC 1260
TCTGTCTTGT CAGTGGAGGT GCCTGGGTGG GGAGCAGGTC TCAGGCCTCT TGTCCTCTCC 1320
CCAGTGGCTC CAGGCCTCAC TAGTGGCAAG GGCAGGATGA GGCTGCACCG CTGGGAAGAG 1380
TCTATCTAAG YTCTTGGCTT GGAGTCCCGT GTCGTCTCCR CCCAGAGGAA GTTCTCCAGA 1440
GTTCACCTTT CCCTTTTCCT TGAGTTGTGC TGAATGCCCC ACCCCAGCTC TCTTTCCCTT 1500
CTGGGTGTCT TTGCTGGGAG GGGGCTGTGT TGTGAGCCCT CCCGGTTCTC ACCTCGCCTG 1560
GCACTTAACC ACACCCTGGT TTTGTGTAGC CGCCAGCTCT CTTCTGGTTG GGCCTTTGAA 1620
AGGCTCAGCC TCCCATTGTG CAGTGCTTGG GTTTGGAGCT TATTTGAATG GAAGAGGTCA 1680
GTTTGTTCCT GGCTCTCCAT TTCTGGCCTC AGTTGTCTAC AGGACAGTGG TCAGGGATGC 1740
CTGGAGGCAT ATATCCAGCT GCCACCAAGG GGCACTGTTT GTTCCCACTT ATGTGAGTGA 1800
CCCCATCCAT CCATGACCAG AGGATTATTT TCCTGCCTTG GCAGAGGAGG AGGAGTCAAG 1860
GGAGCAGGGC AGCTCTACCA GGCAAGGTGT TTCCCCAGCA TAGGCGCAGA CAGTTGGGAC 1920
GAAACTTCAG AGCCCAGGCA GTCCCTGAAT GACCAGGCCA GTGTTGTCAC TGAGTGGTCC 1980
CCTGCTGGTT GGGAGTGAAG AGAATCCAGG CTGGCAGAGC TGGAGCCAGT TGGGGAGCAC 2040
GGTTCTGGGA GCTCTGCAAA ATCAGTAGCA AGTGCTGGAA AAGGCACATG CCGAAGATAC 2100
TCAAGAGCTC CCAAGATTTG CTTGAGGCTA GCCCAGTGAA RAAAACCAGA GACTCATGTT 2160
TCCAGGGGTC AGTCTGTCAG GCAGGAAGGA CCCAGGATTT GAACCCAGCT TCAGTGTGCA 2220

GGCTCTGAGG CTGCCCAGGA CGGGAAAGTC CAAGGAAGGG GCCTGGTGGT GCTCCACTTG 2280 

CAGTTCTTTA AAGAATGCTG CTTTTTATTC TCCTAACCCT TTCAAGTGGG TGCAGACTTC 2340 

TCGTTAGCAG CTGGAAGACA TTCCTCCCAC ACTTTTCCCT TCCTCGCCCA AGAGAGCATC 2400 

CAGAAGGCAG TAGGACCTGG TTTTTCAGGT ACTGGGAGCC GGGGGCTCAC TGCTTGCACT 2460 

GTGCTTAGGG TAGGGATGGT AAATATCCTC CCTGCATGGC TTTATCCTCC CTCTCATCCC 2520 

AAAGCAGGTA TCTTCTGGTT GTCACAGAGT TTCATTGAGT CCAGCTGCAG CCACGTGGCC 2580 

ATCTGGAGCT GGTGCTATAG GTGACCATCT GGTACATTGA GGGGACCTGT TTGCCTCCTC 2640 

CACTCTATAA GCAGTCATCT TGGGAGACCG GGAGGAGAAG GTGGTGGGCT AGTCCTGTCT 2700 

CCTCCTCCAC TTCCCATGCC TCTATGTTAC CCATCTGTGT CTCCTGTGCA GAAGGAGAGG 2760 

AAGGGGCATT AAGAGATGAA GGGTGATTAT GTATTACTTA TCCATTTCTG AATAAACATT 2820 

TGTTATTCCT AAAAAAAAAA AAAAAAAACT CGAGGGGGGG CCCGGWACCC AWATCGCCSK 2880 
AAAGTGAG 2888

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1851 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CACTAGTATA ATTTATAATT ATAACCTATT CTGATTTCTT TTCAAATATT AGGTGTCCTA 60
GTTGCCTATG AAGGTTTGCC ACTTCATCTT CX^CTGTTCC CCAAACTTTG GACTGAGCTA 120
TGCCAGACTC AGTCTGCTAT GTCAAAAAAC TGCATCAAGC TTTTGTGTGA AGATCCTGTT 180
TTCGCAGAAT ATATTAAATG TATCCTAATG GATGAAAGAA CTTTTTTAAA CAACAACATT 240
GTCTACACGT TCATGACACA TTTCCTTCTA AAGGTTCAAA GTCAAGTGTT TTCTGAAGCA 300
AACTGTGCCA ATTTGATCAG CACTCTTATT ACAAACTTGA TAAGCCAGTA TCAGAACCTA 360
CAGTCTGATT TCTCCAACCG AGTTGAAATT TCCAAAGCAA GTGCTTCTTT AAATGGGGAC 420
CTGAGGGCAC TCGCTTTGCT CCTGTCAGTA CACACTCCCA AACAGTTAAA CCCAGCTCTA 480
ATTCCAACTC TGCAAGAGCT TTTAAGCAAA TGCAGGACTT GTCTGCAACA GAGAAACTCA 540
CTCCAAGAGC AAGAAGCCAA AGAAAGAAAA ACTAAAGATG ATGAAGGAGC AACTCCCATT 600
AAAAGGCGGC GTGTTAGCAG TGATGAGGAG CACACTGTAG ACAGCTGCAT CAGTGACATG 660

AAAACAGAAA CCAGGGAGGT CCTGACCCCA ACGAGCACTT CTGACAATGA GACCAGAGAC 720 

TCCTCAATTA TTGATCCAGG AACTGAGCAA GATCTTCCTT CCCCTGAAAA TAGTTCTGTT 780 

AAAGAATACC GAATGGAAGT TCCATCTTCG TTTTCAGAAG ACATGTCAAA TATCAGGTCA 840 

CAGCATGCAG AAGAACAGTC CAACAATGGT AGATATGACG ATTGTAAAGA ATTTAAAGAC 900 

CTCCACTGTT CCAAGGATTC TACCCTAGCC GAGGAAGAAT CTGAGTTCCC TTCTACTTCT 960 

ATCTCTGCAG TTCTGTCTGA CTTAGCTGAC TTGAGAAGCT GTGATGGCCA AGCTTTGCCC 1020 

TCCCAGGACC CTGAGGTTGC TTTATCTCTC AGTTGTGGCC ATTCCAGAGG ACTCTTTAGT 1080 

CATATGCAGC AACATGACAT TTTAGATACC CTGTGTAGGA CCATTGAATC TACAATCCAT 1140 

GTCGTCACAA GGATATCTGG CAAAGGAAAC CAAGCTGCTT CTTGACATTA GGTGTAGCAT 1200 

GTCTACTTTT AAGTCCCTCA CCCCCAACCC CCATGCTGTT TGTATAAGTT TTGCTTATTT 1260 

GTTTTTGTGC TTCAGTTTGT CCAGTGCTCT CTGCTTGAAT GGCAAGATAG ATTTATAGGC 1320 

TTAATTCTTG GTCAGGCAGA ACTCCAGATG AAAAAAACTT GCATCTTCAG TATACTTCCT 1380 

AAAGGGCAAT CAGATAATGG ATATGTTTTA TGTAATTAAG AGTTCACTTT AGTGGCTTTC 1440 

ATTTAATATG GCTGTCTGGG AAGAACAGGG TTGCCTAGCC CTGTACAATG TAATTTAAAC 1500 

TTACAGCATT TTTACTGTGT ATGATATGGT GTCCTCTGTG CCAGTTTTGT ACCTTATAGA 1560 

GGCAGATTGC CTCCGATCGC TGTGGTTCTT ATTATCAAAA TTAAGTTTAC TTGTATACGG 1620 

AACAACCACA AGAAATTTGA TTCTGTAAAG AATCCTCTTT AGCTGTGGCC TGGCAGTATA 1680 

TAAATGGTGC TTTATTTAAC AGAATACCTG TGGAGGAAAT AAAGCACACT TGATGTAAAA 1740 

ATAATTGTTT TATTTTTATT GACATGACTG ATTGATTGCT ATTCTGTGCA CTTAATTAAA 1800 

CTGATTGTGA TGACTTWWAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA A 1851 




(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3542 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 


TCCAATGCTG ATGAGCGTCT TCGCTGGCAG GCCAGCTCCT TGCCTGCTGA TGACCTTTGC 60 
ACAGAAAATG CCATCATGCT GAAACGATTC AATAGGTATC CGCTGATCAT TGACCCCTCT 120 
GGACAGGCCA CAGAATTCAT TATGAATGAA TATAAGGWTC GTAAGATCAC ACGGACCAGC 180 
TTCCTGGATG ACGCCTTCAG AAAGAACTTA GAGAGTGCAC TGAGATTCGG TAACCCCCTT 240 

CTGGTCCAGG ATGTGGAAAG CTACGATCCA GTTTTGAACC CGGTGCTGAA CCGTGAAGTG 300 

CGGCGAACAG GGGGGAGAGT GCTGATCACT CTCGGGGACC AGGACATAGA CCTGTCGCCA 360 

TCGTTTGTCA TCTTCCTGTC CACCCGGGAT CCAACTGTCG AGTTCCCACC AGATCTCTGT 420 

TCCCGGGTTA CTTTTGTAAA CTTCACAGTT ACCCGTAGCA GTTTACAAAG CCAGTGTCTA 480 

AATGAAGTAC TTAAAGCAGA AAGACCTGAT GTGGACGAGA AACGATCTGA TCTTCTTAAA 540 

CTTCAAGGGG AATTTCAGCT CCGTTTGCGT CAGCTQGAAA AATCTCTACT ACAAGCTCTG 600 

AACGAGGTGA AAGGGCGCAT TTTGGATGAC GACACGATCA TAACCACTCT GGAGAACCTG 660 

AAGAGAGAGG CTGCAGAGGT CACCAGGAAA GTTGAGGAGA CGGACATTGT CATGCAGGAG 720 

GTGGAGACCG TGTCCCAGCA GTACCTCCCG CTCTCCACCG CCTGCAGCAG CATCTACTTC 780 

ACCATGGAGT CCCTCAAGCA GATACACTTC TTGTACCAGT ACTCCCTCCA GTTTTTCCTC 840 

GACATTTATC ACAACGTCCT ATACGAGAAC CCGAACCTGA AGGGTGTCAC CGACCACACA 900 

CAGCGCCTGT CCATTATAAC AAAGGACCTC TTCCAGGTGG CGTTTAACCG AGTGGCTCGA 960 

GGCATGCTGC ATCAGGACCA CATTACCTTT GCCATGCTGC TGGCAAGAAT CAAACTGAAG 1020 

GGCACCGTGG GGGAGCCCAC CTACGATGCA GAATTCCAGC ACTTCTTGAG AGGAAATGAG 1080 

ATTGTCCTGA GTGCTGGCTC CACCCCCAGG ATCCAGGGCC TGACTGTGGA GCAGGCGGAG 1140 

GCGGTGGTGA GGCTGAGCTG CCTTCCCGCG TTTAAGGACT TGATTGCAAA GGTTCAGGCA 1200 

GACGAGCAAT TTGGCATCTG GCTGGACAGC AGCTCCCCGG AGCAGACTGT GCCCTACCTC 1260 

TGGAGTGAAG AAACACCTGC AACACCCATT GGCCAGGCCA TCCACCGCCT GCTCCTGATC 1320 

CAGGCTTTCC GGCCCGATCG CCTGTTGGCC ATGGCCCACA TGTTTGTTTC AACAAACCTT 1380 

GGGGAGTCTT TCATGTCCAT CATGGAGCAG CCGCTCGACC TGACCCACAT TGTGGSCACA 1440 

GAGGTGAAGC CCAACACTCC TGTCTTAATG TGCTCTGTGC CTGGTTATGA TGCCAGTGGA 1500 

CATGTCGAGG ACCTTGCAGC CGAGCAGAAC ACGCAGATCA CTTCAATTGC AATCGGCTCT 1560 

GCAGAAGGCT TTAACCAAGC AGATAAGGCA ATAAACACCG CTGTAAAGTC GGGCAGOTGG 1620 

GTGATGCTGA AGAATGTGCA TCTGGCCCCA GGGTGGCTGA TGCAGCTGGA GAAGAAGTTG 1680 

CATTCCCTGC AGCCGCATGC CTGCTTCCGA CTCTTCCTCA CCATGGAGAT CAACCCCAAG 1740 

GTGCCTGTGA ATCTGCTCCG TGCGGGCCGC ATCTTTGTGT TCGAGCCACC GCCAGGGKTG 1800 

AAGGCCAACA TGCTGAGGAC GTTCAGCAGC ATTCCCGTCT CACGGATATG CAAGTCTCCC 1860 

AACGAGCGTG CCCGCTTGTA CTTCCTGCTG GCCTGGTTTC ATGCGATCAT CCAAGAACGC 1920 

TTACGATACG CACCACTGGG GTGGTCAAAG AAGTATGAAT TTGGAGAGTC TCACCTGCGG 1980 






TCANYTTGCG ATACGGTGGA CACGTGGCTG GATGACACGG CCAAGGGCAG GCAGAACATC 2040 
TCACCGGATA AGATCCCGTG GTCTGCACTA AAGACCTTAA TGGCCCAGTC CATTTATGGC 2100 
GGGCGCGTGG ACAACGAGTT TGACCAGCGT CTGCTCAACA CCTTCCTGGA GCGCCTGTTC 2160 
ACAACCAGGA GTTTCGACAG TGAGTTTAAG CTGGCATGCA AGGTCGACGG ACATAAAGAC 2220 
ATTCAAATGC CAGATGGCAT GCAGGCGAGA GGAGTTTGTG CAGTGGGTGG AGTTGCTCCC 2280 
CGACACCCAG ACGCCCTCCT GGCTGGGCCT GCCCAACAAC GCCGAGAGAG TCCTCCTTAC 2340 
CACACAGGGT GTGGACATGA TCAGTAAAAT GCTGAAGATG CAGATGTTGG AGGATGAGGA 2400 
CGACCTGGCC TACGCAGAGA CTGAGAAGAA GACGAGGACA GACTCCACGT CCGACGGGCG 2460 

CCCTGCCTGG ATGCGGACAC TGCACACCAC CGCGTCCAAC TGGCTGCACC TCATCCCCCA 2520 

GACGCTGAGC CACCTCAAGC GCACCGTGGA GAATATCAAG GATCCTTTGT TCAGGTTCTT 2580
TGAGAGAGAA GTGAAGATGG GCGCAAAGCT GCTTCAGGAC GTTCGCCAGG ACCTTGCAGA 2640
TGTCGTCCAG GTCTGCGAAG GAAAGAAGAA GCAGACCAAC TACTTGCGCA CGCTGATCAA 2700
CGAGCTAGTG AAAGGGATCT TGCCTCGGAG CTGGTCCCAC TACACGCTGC CTGCCGGCAT 2760
GACCGTCATC CAGTGGGTGT CCGACTTCAG CGAGAGGATC AAACAGCTGC AGAACATCTC 2820
ACTGGCAGCT GCATCTGGTG GCGCCAAGGA GCTAAAGAAC ATCCACGTGT GCCTGGGTGG 2880
CCTGTTCGTG CCTGAGGCGT ACATCACTGC CACCAGGCAG TATGTGGCCC AGGCCAACAG 2940
CTGGTCCCTC GAGGAGCTCT GCCTGGAAGT CAACGTCACC ACCTCACAGG GCGCCACCCT 3000
TGACGCTTGC AGCTTCGGAG TCACGGGTTT GAAACTTCAA GGGGCCACGT GCAACAACAA 3060
CAAGCTGTCA CTGTCCAATG CCATCTCAAC CGCCCTTCCC CTGACGCAGC TGCGCTGGGT 3120
CAAGCAGACA AACACCGAGA AGAAGGCCAG TGTGGTAACC TTACCTGTCT ACCTGAACTT 3180
CACCCGTGCA GACCTCATCT TCACCGTGGA CTTCGAAATT GCTACAAAGG AGGATCCTCG 3240
CAGCTTCTAC GAGCGGGGTG TCGCAGTCTT GTGCACAGAG TAAACTTTTC TAGCTGCCCC 3300
TTTCTGTAAT AGTGAAAGTT GGTATTTAAC ATTTATTCAT TTTTAAAATA TTTGGAAGGT 3360
CTGAGCTTGT GAAAAGAAAG TGGTTGGTCT GAGGTTGGAG GAAGCTGAAT GGAATCTGAC 3420
GGTTGGGAGT GGTGGAAATT GGAAGGATAC CAGGAGGTAT TTGOGAAGGC CAATGGCGTC 3480

GCTCCTTTGA GGAAATAAAA CACTAAGCAT GAAAAAAAAA AAAAAACTTA CAANCCNCAA 3540 



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 883 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

AGGTGATTTT AATGATAGGT GTCATATATA GGACGGATAA TCTGTTTACA TTCTGTTCTT 60
CTCGATGCAC TCACAAGCGG GTAACTAGGT GACAAGAAAA CAAAGATCTT ATTCAAAAGA 120
GGTCTTACAG CAACCCAACG TCTCATCTTC CCATAGTAAA GATGACGGCG CCTTGAGGTA 180
AGCTACAGGC AACACCACTT CCGCGTTTCT CTTGCGCCCT GGTCCAAGAT GGCGGATGAA 240
GCCACGCGAC GTGTTGTGTC TGAGATCCCG GTGCTGAAGA CTAACGCCGG ACCCCGAGAT 300
CGTGAGTTGT GGGTGCAGCG ACTGAAGGAG GAATATCAGT CCCTTATCCG GTATGTCGAG 360
AACAACAAGA ATGCTGACAA CGATTGGTTC CGACTGGAGT CCAACAAGGA AGGAACTCGG 420
TGGTTTGGAA AATGCTGGTA TATCCATGAC CTCCTGAAAT ATGAGTTTGA CATCGAGTTT 480
GACATTCCTA TCACATATCC TACTACTGCC CCAGAAATTG CAGTTCCTGA GCTCGATCGA 540
AAGACAGCAA AGATGTACAG GGGTGGCAAA ATATGCCTGA CGGATCATTT CAAACCTTTG 600
TGGGGCCAGG AATGTGCCCA AATTTGGACT AGCTCATCTC ATGGCTCTGG GGCTGGGTCC 660
ATGGSTGGCA GTGGAAATCC CTGATCTGAT TCAGAAGGGC GTCATCCAAC ACAAAGAGAA 720
ATGCAACCAA TGAAGAATCA AGCCACTGAG GCAGGGCAGA GGGACCTTTG ATAGGCTACG 780
ATACTAWTTT CCTGTGCATC ACACTTAACT CATCTAACTG TTCCCCGGAC ANCCTCCACT 840
CTAGTTGTTA CTAAGTANTG CAGTAGCATT NTGGGGAAGA ACA 883

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

GGCACGAGGT GGCCTCTACC CTGGGCTCAT CTGGCTACAC AGGGACTCTA AACGCTTCCA 60
GATTCCCTGG AAACATGCCA CCCGGCATAG CCCTCAACAA GAAGAGGAAA ATACCATTTT 120
TAAGGCCTGG GCTGTAGAGA CAGGGAAGTA CCAGGAAGGG GTGGATGACC CTGACCCAGC 180
TAAATGGAAG GCCCAGCTGC GCTGTGCTCT CAATAAGAGC AGAGAATTCA ACCTGATGTA 240
TGATGGCACC AAGGAGGTGC CCATGAACCC AGTGAAGATA TATCAAGTGT GTGACATCCC 300
TCAGCCCCAG GGCTCGATCA TTAACCCAGG ATCCACAGGG TCTGCTCCCT GGGATGAGAA 360

GGATAATGAT GTGGATGAAG AAGATGAGGA AGATGAGCTG GATCAGTCGC AGCACCATGT 420 

TCCCATCCAG GACACCTTCC CCTTCCTGAA CATCAATGGT TCTCCCATGG CGCCAGCCAG 480 

TGTGGGCAAT TGCAGTGTGG GCAACTGCAG CCCGGAGGCA GTGTGGCCCA AAACTGAACC 540 

CCTGGAGATG GAAGTACCCC AGGCACCTAT ACAGCCCTTC TATAGCTCTC CAGAACTGTG 600 

GATCAGCTCT CTCCCAATGA CTGACCTGGA CATCAAGTTT CAGTACCGTG GGAAGGAGTA 660 

CGGGCAGACC ATGACCGTGA GCAACCCTCA GGGCTGCCGA CTCTTCTATG GGGACCTGGG 720 

TCCCATGCCT GACCAGGAGG AGCTCTTTGG TCCCGTCAGN CTGGAGCAGG TCAAATTCCC 780 

AGGTCCTGAG CATATTACCA ATGAGAAGCA GAAGCTGTTC ACTAGCAAGC TGCTGGACGT 840 

CATGGACAGA GGACTGATCC TGGAGGTCAG CGGTCATGCC ATTTATGCCA TCAGGCTGTG 900 

CCAGTGCAAG GTGTACTGGT CTGGGCCATG TGCCCCATCA CTTGTTGCTC CCAACCTGAT 960 

TGAGAGACAA AAGAAGGTCA AGCTATTTTG TCTGGAAACA TTCCTTAGCG ATCTCATTGC 1020 

CCACCAGAAA GGACAGATAG AGAAGCAGCC ACCGTTTGAG ATCTACTTAT GCTTTGGGGA 1080 

AGAATGGCCA GATGGGAAAC CATTGGAAAG GAAACTCATC TTGGTTCAGG TCATTCCAGT 1140 

AG1GGCTCGG ATGATCTACG AGATGTTTTC TGGTGATTTC ACACGATCCT TTGATAGTGG 1200 

CAGTGTCCGC CTGCAGATCT CAACCCCAGA CATCAAGGAT AACATCGTTG CTCAGCTGAA 1260 

GCAGCTGTAC CGCATCCTTC AAACCCAGGA GAGCTGGCAG CCCATGCAGC CCACCCCCAG 1320 

CATGCAACTG CCCCCTGCCC TGCCTCCCCA GTAATTGTGA ATGCCATCTT CTTCCTTCTC 1380 

TTTTTTATAA TATTGTACAT ATGGATTTTT TTATTGTTTA GATTTAACCA GCTTTTAAAT 1440 

CTCTGTTTTC TGTGACAGTG TTAGAAGTTT GTGATTCTCC AAATATGCCT AGATTTAAAG 1500 

CTGATTTAAT TTATGGAAAA AAAAAAAAAA AAAAAAAAAA A 1541 



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(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

AGAAAATGAA TGTTAGAAGG TGCCTGCCGA GGCGGGACAG AGTGTTTGCT CGCGCTGGAG 60

AAGGCTCTGC TCAGCCCTGA GAGTCCCTTC CTGCCCCACC GATACTGGCA CTTTAAAAAG 120

GAAGCTGACC GCACAGTGTC CAGACGAATT GGCCCCCAGA AGATGGGGAG TTCTGTCCTG 180

CCCTTCTGTG TCTGCGTGAC CTCACCCAGC CTAGGAGGGA GGTGCATTCA GGGTAGATTT 240 

GCCTCTCATT CAAAGTTCTG GGGCTTTGGG CGGAAAACAG CCAGCTTTGG CGCTGTTGGG 300 

GAGACTCCTC CAGACCAGGA ACCCCAGAAG GAGACAGAGC CTGCCACATC CTCCCACGCC 360 

AGGCCCTGGG CCAGGGTCAT TGGACTGAGA ATTTGGCCAC AACCAAATTG ATGCTGGCTG 420 

GAACCAGAGG CCAGAAAGCC TGGCCTTGTC CCCATGTGGG AGCCCTGTCC TCAGCCCTCT 480 

TGTCCCCTTG AGCTCAGTGA ATTCCCACCA GGTGCCCACA GCTCCTGGAC TTCAAATTCT 540 

ATATATTGAG AGAGTTGGAG AGTATATCAG AGATATTTTT GGAAAGGAGT TGGTCTATGC 600

AATGTCAGTT TGGAATCTTC TTGAAAGTTT AATGTTTTTA TTAGGAGATT TAAAGAAAAT 660 

AAAGGTCTAC AATATCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 720 

AAAAAAAAAA AA 732 



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 629 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

TTAAGGAATT CGGCMCGATC CCGGCAAGTA ACATGACTAA AAAGAAGCGG GAGAATCTGG 60 

GCGTCGCTCT AGAGATCGAT GGGCTAGAGG AGAAGCTGTC CCAGTGTCGG AGAGACCTGG 120 

AGGCCGTGAA CTCCAGACTC CACAGCCGGG AGCTGAGCCC AGAGGCCAGG AGGTCCCTGG 180

AGAAGGAGAA AAACAGCCTA ATGAACAAAG CCTCCAACTA CGAGAAGGAA CTGAAGTTTC 240 

TTCGGCAAGA GAACCGGAAG AACATGCTGC TCTCTGTGGC CATCTTTATC CTCCTGACGC 300 

TCGTCTATGC CTACTGGACC ATGTGAGCCT GGCACTTCCC CACAACCAGC ACAGGCTTCC 360 

ACTTGGCCCC TTGGTCAGGA TCAAGCAGGC ACTTCAAGCC TCAATAGGAC CAAGGTGCTG 420 

GGGTGTTCCC CTCCCAACCT AGTGTTCAAG CATGGCTTCC TGGCGGCCCA GGCCTTGCCT 480

CCCTGGCCTG CTGGGGGGTT CCGGGTCTCC AGAAGGACAT GGTGCTGGTC CCTCCCTTAG 540 

CCCAAGGGAG AGGCAATAAA GAACACAAAG CTGAAAAAAA AAAAAAAAAA AACTCGTAGG 600 

GGGGGCCCGT ACCCAATCGC CCTNTCGTG 629 



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(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1751 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

CTGCTAGCCG GCCGGCGCAG GCTGCCGAGC GGGTGAGCGC GCAGGCCAGG CCAAAGCCCT 60 

GGTACCCGCG CGGTGCGGGC CTCAGTCTGC GGCCATGGGG GCGTCCGCGC GGCTGCTGCG 120 

AGCGGTGATC ATGGGGGCCC CGGGCTCGGG CAAGGGCACC GTGTCGTCGC GCATCACTAC 180 

ACACTTCGAG CTGAAGCACC TCTCCAGCGG GGACCTGCTC CGGGACAACA TGCTGCGGGG 240 

CACAGAAATT GGCGTGTTAG CCAAGGCTTT CATTGACCAA GGGAAACTCA TCCCAGATGA 300 

TGTCATGACT CGGCTGGCCC TTCATGAGCT GAAAAATCTC ACCCAGTATA GCTGGCTGTT 360 

GGATGGTTTT CCAAGGACAC TTCCACAGCC AGAAGCCCTA GATAGAGCTT ATCAGATCGA 420 

CACAGTGATT AACCTGAATG TGCCCTTTGA GGTCATTAAA CAACGCCTTA CTGCTCGCTG 480 

GATTCATCCC GCCAGTGGCC GAGTCTATAA CATTGAATTC AACCCTCCCA AAACTGTGGG 540 

CATTGATGAC CTGACTGGGG AGCCTCTCAT TCAGCGTGAG GATGATAAAC CAGAGACGGT 600 

TATCAAGAGA CTAAAGGCTT ATGAAGACCA AACAAAGCCA GTCCTGGAAT ATTACCAGAA 660 

AAAAGGGGTG CTGGAAACAT TCTCCGGAAC AGAAACCAAC AAGATTTGGC CCTATGTATA 720 

TGCTTTCCTA CAAACTAAAG TTCCACAAAG AAGCCAGAAA GCTTCAGTTA CTCCATGAGG 780 

AGAAATGTGT GTAACTATTA ATAGTAAGAT GGGCAAACCT CCTAGTCCTT GCATTTAGAA 840 

GCTGCTTTTC CTAAGACTTC TAGTATGTAT GAATTCTTTG AAAATTATAT TACTTTTATT 900 

TCTACTGATT TTATTTTGGA TACTAAGGAT GTGCCAAATG ATTCGGATAC TAAGATGCAT 960 

CGTTTGAAAT CATCTAGTGT GTTGTATGCA GTTATCCTCA AAAACATCAG CGATGTCTGA 1020 

ACCTTTAAAA CATCTGTTAG AGCAAAATTA AAAGAGCATT TGGTAGTAAT CTAACTTTTT 1080 

GTTCAGTTAA TAAGTGGTTG ATAAAGTTTC CATATTTTTC TGGAAAAGTT AAAAAAAGTT 1140 

ACATGTCATT TGGAGAAAAT ACGTAATCAG AAATTTGTGC ATAGATTGAT GCCAAAAAAG 1200 

ACATTTCCAG CATTGTGGAA CATGGTGAGA CACTATATAA AATTCCAGAA AGAAAGCAAC 1260 

TGGATTTACA GATTTATTGT GAGACACAAA TTCACTGCTG CCTTTACACT AAGAAATGTA 1320 

TATGTTAACC ATATATGCTG TATTTATTTT GTCGTTAAGC ATACTTTCAG TTTACTCAGA 1380 

ATTTTCAATT TGCTATAAAG ATGTATCAAT TAGCATATAG AAAAATATTA CTTTAAGATG 1440 

ACTTGTTTCC TTTGAAAATA CCTGTGTACT GAGGGTTATG ATTTGTGTCA AAAATTGACA 1500

TAAGTGCTTT TACAAGCACC AAAGTTGAAT GAATTTTCAA CAAAATGTAA TTAAAGTCTA 1560

TGTTTTCAGT TATGACTCAG GTTAAGAAAT GTGTTTTAGG ATCTACTTGC TGGTTTTTCT 1620

TTTTGATCCA AATGTGTGAT CTGCCCTGAT AAATAACAAG TTATNGTACC ATCTCCCCCG 1680



CCAATAAAAA AAAAAAAAAA AAAAAAAAAC TCGAGGGGGG GCCCGGTACC CAATTCTCCG 
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1751 






(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 508 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GGCACGAGAT TATGTATTAA AATGTTTTTG AATTGTGAAA TATTAGAATA TTGITACTAT 60 

TTGACCCAAC TCAAAATCTC CATGGGAAAA TACCTGTCGA TACCCACAGT ATTGTTGAAA 120 

ATAATCAGAT GCAGTATCAC AGCTGTGTCA GACTCTAGTA CCAGTTGGGC AATCAAGGCA 180 

CAGCTAAAAA TTGAAAACAA AGATCTGGAC AACAAAACAG CCAAAGGTGG GGGTCAAGAA 240 

GCTCTGACG1' GTACCTAGCT GTAGAATGCT ATGCACACGT GCCAGGTGTA GTGTGCATAT 300 

CCAGGAAAAA CTGCAGAGAG CCCCAGTCTT CACCTCTGGT TGACCATGAG CTCTGTGTAA 360 

GCAGGAAGTG AAGGCTAAGG CAGATTTAAG CTCTGAAAGC ATTCCACAAC ATACACACAA 420 

ATCGTGCAAA GCATTAAGGA AATCTTGTTA CTGCTAAGTG TTGCTGACCC AGGAACAACT 480 

CCTACTCAGC TGGACTTAAA AATAAAAA 508 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 245 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

TACATAGAGC AAAGAGAAAT TTCCAGAATT TCTARAATTC TGGAAAGAGA ATTTTCCTGA 60 

GATTGCAGAT TTGCTTGTGT CCTCAGGTGA TGATGAGGGC TGTTTTCCCC TCTTGTCCTT 120 

TCCTCACACT CATGCTTCCT CTCCTAGAGT GTCTGGTTGG CATGATCATG TGCTACCTAG 180 
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GCATTTCTTT CACTGATACA AGGAAAACTG CAGGGTTAAA AAAAAAAAAA AAAAAAAAAA 240 
NCNCG 245 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

ATGTTCCTCA TGAGGATGCA CTTGTGCTTC TGCAAGTATT GCTGCAGCTT CATAGTGACT 60 

CCCACCAGCA CCAGCAATAC AGCTAGCTAC CTGT3GCCTT GGATCTCAGC CAGCATGGCT 120 

GGGAGAGGGA GCAGCTGGGC ATGTACCCTA AATGCTGTTA CCAGGGAAGG ACTCCCAGAG 180 

TGAAGACAAG TAGGGACTTC CTGCAGAGGT GGTACATGTG CTCTCTGTAT CCATACTTTT 240 

TTTTTTTTTT TTTTGAGATA GAGTTTCACC CTTGTTGCCC TGGCTGGAGT GCAATGGTGC 300 

GATCTCAGCT CACTGCAACC TCTCTGCCTC CCGGGTTCAA GTGATTCTCC TCCCTCAGCC 360 
T 361



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 713 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

AGGATCACAC AATAGAGAAC ACTGTAGTAA CATTTCGGTC TGCTCACAAG ACCCAGAACA 60

TTGATCAGTT TTTGTTGTTG GTTTATTATT TTTCTGTTAA AAAATTGTGA AAAGTTTGTT 120 

TTAGCTAGAT GATATTTTAA TAGCTGCGAG TGCTTTGGAA CTATAAAGAT GTCACTACTT 180

AACACACATA CCTTATGTTT TGTTTTGTTT TGTTTTACAC TCAGTATAAA TCAGGAGAAG 240 

TTAGCCAACC ATCTAGCATT TAGAATCCTC TTTTTTATTG TCTTCTAAGG ATATGGATGT 300 

TCCCATAACA GCAACAAAAC AGCAACAAAA ACATTTCATA AATATCACTT GATAGACTGT 360

AAGCACCTGC TTAACTTTGT GTCCCAAATA TTTAGTGTGT ATATATATAT ATATATATAC 420 

ACACACACAC ACATATATAT TCAACAAATA AAGCAAAATA TAACATGCAT TTCACATTTT 480 

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GTCTTTCCCT GTTACGATTT TAATAGCAGA ACTGTATGAC AAGTTTAGGT GATCCTAGCA 540 

TATGTTAAAT TCAAATTAAT GTAAAACAGA TV AACAACAA CAAAGAAACT GTCTATTTGA 600 

GTGAAGTCAT GCTTTCTATT ATAATAACTT GGCTTCGGTT ATCCATCAAA TGCACACTTA 660

TACTGTTATC TGATTGTTTA TAATAAAGAA TACTGTACTT ATAAAAAAAA AAA 713 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 862 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

GAAAGTCAGA GCTGTCCAAT CCCTCAGCAC CTTTTAGATT TGCTCCAAAT TAGAAACGTG 60 

GGGACTATGT GTTCTGGGCA ATCACAGGTC TGGAAAATGG CTCTGCAGGC TCTTGATAGT 120 

GAGACAGTGG TCATCTTACC AGACATGCAT CTGATTTTAA GCCTCAGGCT AATCCACAAT 180 

GCTCGGCCAT GCCTATGATT AACAAACAAA AGCAAAATCT GCTTTTATAG TTTAGGAAAC 240 

CTGGATAGAA CAGTATTTTT CAGCATTCTT GGATAAAGCA GTTCTGCATT TTTAAATTGG 300

GACTGCAGAA GTGACTGTCT ATAGTTGTGA AATACAAAAA ATGGTATGTT TGATCAGAAA 360 

AGGAAGCCCG TGCCTGGCAC TTGGAAAGAT ACTGAGCATC ATAACCCTAA TGAGAAAATG 420 

TAGGCTCTGT GAATGTTAAC TACAAATCAG GTTAGGAAAG CATATGACAC CCTTTGTCAA 480 

ACTAAGCTTC ACTAGGAGGA CCTGTGCTCA TAGAAGAATA TGCTTTAAAA GTATCAATTT 540

TCCACAGTCG ATGATGGAGA AAAGTTCATT TGCACCAGAA TGCTGATAGT CACAATACAC 600

AGCCTGACAT ATATAACAAT ACAGTTTTCT GTAAACAGAA GTTCTTCCTC TTCCAATTCA 660 

GGAGTCAGTC AGAGCATAAA TATTGCATGT TTCACTTTAG AAACTGATTC ATTTTAGAAA 720 

GCAGATCTGG ATTATTTTGC AGGGTAGAAA TGAAGGCTAT TTCTGGCATT CTTGCTCAAA 780 

AAGTCAATAT ATGTACATTA AGTATAAAAA AGGGTCTCTT TCACCTCTTT TGTTTCGTAG 840 

CATTGGCTAC ATAACTCGTG CC 862



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4602 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

CCGAGGGGGC GKGGGGAGCA GCGCCGARGC CGCCGCCTCC GCCTCCGCCG CCTAGGACTA 60

GGGGGTGGGG GACGGACAAG CCCCGATGCC GGGGGAKACG GAAGAGCCGA GACCCCCGGA 120



GCAGCAGGAC CAGGAAGGGG GAGAGGCGGC CAAGGCGGCT CCGGAGGACC CGCAACAACG 180

GCCCCCTGAG GCGGTCGCGG CGGCGCCTGC AGGGACCACT AGCAGCCGCG TGCTGAGGGG 240 

AGGTCGGGAC CGAGGCCGGG CCGCTGCGRC CGCCGCGCMG CAGCTGTGTC CCGCCGGAGA 300 

AGGCCGAGTA TCCCCGCCGG CGAGGAGCAG CCCCAGCGCC AGGCCTCCCG ACGTCCCCGG 360

GCAGCAGCCC AGGCCGCGAA GTCCCCGTCT CCAGTTCAGG GCAAGAAGAG TCCGCGACTC 420 

CTATGCATAG AAAAAGTAAC AACTGATAAA GATCCCAAGG AAGAAAAAGA GGAAGAAGAC 480 

GATTCTGCCC TCCCTCAGGA AGTTTCCATT GCTGCATCTA GACCTAGCCG GGGCTGGCGT 540 

AGTAGTAGGA CATCTGTTTC TCGCCATCGT GATACAGAGA ACACCCGAAG CTCTCGGTCC 600 

AAGACCGGTT CATTGCAGCT CATTTGCAAG TCAGAACCAA ATACAGACCA ACTTGATTAT 660

GATGTTGGAG AAGAGCATCA GTCTCCAGGT GGCATTAGTA GTGAAGAGGA AGAGGAGGAG 720 

GAAGAAGAGA TGTTAATCAG TGAAGAGGAG ATACCATTCA AAGATGATCC AAGAGATGAG 780

ACCTACAAAC CCCACTTAGA AAGGGAAACC CCAAAGCCAC GGAGAAAATC AGGGAAGGTA 840 

AAAGAAGAGA AGGAGAAGAA GGAAATTAAA GTGGAAGTAG AGGTGGAGGT GAAAGAAGAG 900 

GAGAATGAAA TTAGAGAGGA TGAGGAACCT CCAAGGAAGA GAGGAAGAAG ACGAAAAGAT 960

GACAAAAGTTC CACGTTTACC CAAAAGGAGA AAAAAGCCTC CAATCCAGTA TGTCCGTTGT 1020 

GAGATGGAAG GATGTGGAAC TGTCCTTGCC CATCCTCGCT ATTTGCAGCA CCACATTAAA 1080 

TACCAGCATT TGCTGAAGAA GAAATATGTA TGTCCCCATC CCTCCTGTGG ACGACTCTTC 1140 

AGGCTTCAGA AGCAACTTCT GCGACATGCC AAACATCATA CAGATCAAAG GGATTATATC 1200 

TGTGAATATT GTGCTCGGGC CTTCAAGAGT TCCCACAATC TGGCAGTGCA CCGGATGATT 1260

CACACTGGCG AGAAGCATTA CAATGTGAGA TCTGTGGATT TACTTGTCGA CAAAAGGCAT 1320 

CTCTTAATTG GCACATGAAG AAACATGATG CAGACTCCTT CTACCAGTTT TCTTGCAATA 1380

TCTGTGGCAA AAAATTTGAG AAGAAGGACA GCGTAGTGGC ACACAAGGCA AAAAGCCACC 1440 

CTGAGGTGCT GATTGCAGAA GCTCTGGCTG CCAATGCAGG CGCCCTCATC ACCAGCACAG 1500 

ATATCTTGGG CACTAACCCA GAGTCCCTGA CGCAGCCTTC AGATGGTCAG GGTCTTCCTC 1560

TTCTTCCTGA GCCCTTGGGA AACTCAACCT CTGGAGAGTG CCTACTGTTA GAAGCTGAAG 1620 

GGATGTCAAA GTCATACTGC AGTGGGACGG AACGGGTGAG CCTGATGGCT GATGGGAAGA 1680 

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TCTTTGTGGG AAGCGGCAGC AGTGGAGGCA CTGAAGGGCT GGTTATGAAC TCAGATATAC 1740 

TCGGTGCTAC CACAGAGGTT CTGATTGAAG ATTCAGACTC TGCCGGACCT TAGTGGACAG 1800 

GAAGACTTGG GGCATGGGAC AGCTCAGACT TTGTATTTAA AAGTTAAAAA GGACAAAAAA 1860 

AAAATCTAAA GCATTTAAAA TCTAGTGAAA TAACTGAAGG GCCTGCTCTT TCCATTGTGG 1920 

ATCACAGCAC ACACATACAT ACACCCTCCA CCTCCCCATC CCCTGTTCTC CCTCTGTTGC 1980 

TCCCCTTATA AAATTGATGT TGTCTTTACC AGAAAGGTAG ACAAAAAAGA AGCAGCAGCA 2040 

GCTCTTAAAG TGAGGGTTAT TCTCATACTC GGTTCCAGCC ATCAGCAGAC TTCCTGCTCA 2100 

TCGGCAGATC CCCCITTCCA ACCTGTAACT CTGATGTGCT CTGGATCAGC TTTTAACTTT 2160 

TAATCATATA TTACTGTCTT CTAAATCCCT TCTCCTCCTC TACTGCTGCC CTATGGTTCT 2220 

GGCTCCTACC CCCTGCGGCA CACTTATCTT CAAATACCAT AGAATTCTAA TCTCTGAAAT 2280 

CATAGCTCTC CAGTGGCTTT TAAAGAAAGC TGGTCCTCAG CACTAACAAA ATCACTACAA 2340 

T AGO CTAGTG CTTTTTTGGA AGCCTTTTTA GGGAAGAATG TTAGGTTCAT GGTAACTAGT 2400 

ATGCTCTTTG AGATTTTTAC AGTGTTGAAA CTTAAGAATT TTGAGAGGGT GAGGAGGGTT 2460 

GTTCAGAATC TAAATTACAG ATAGATGATT GTTTCTTGTG AATTTGTTTC TTTTCCTTTT 2520 

TTTTTGTCCC TACCATTTCC TTACATTTCC CTTGGGGCCC ATCTCTGGCT CCTTGCITIT 2580 

TGTTTCTTGC TTTGCTTTAT CAGTTCATTC CAGCTCCCTG TTAGTGAAGG ACACTGCTGT 2640 

TAGTGAAGGA ACAAAGTCTA TGAGTCCTAA AATTTTAAGT CAAAGAAAAC TGCTCTGTTT 2700 

CCCCTTTAGT AACACTTCTG AAGAGGAAAA ACTTCAATAG CCAAAGTTAA TAATCCTATA 2760 

TAATAATTGC TTTGGCTTTC ACCTAAAATT CTCGGCATCA CAATTTCCTT GGGATAGAGG 2820 

TTGTGTTGGG GAATAGATTG CTTATTGCTG TTCACTGGAG AGAAAAGGTA GTGTTTTTGT 2880 

ACAAGGTCAT ACCGCCAGAA GCCCCAAATC CTATTTTGGC TCATCTTCAG GTAAAGAGTA 2940 

ATTCCTATCC TGTGTGCCTC AGAAGCTAGA ATCGAAGGCT TACCCTATTC ATTGTTTATT 3000 

GTCAGAAATG CATGATGGCT CTTGGAAAGA ATGACGTTTT GCTGGAAAAA AAAAAAARAA 3060 

CMGTTTGTGT TTCACAAACA TGGCTTATCA ATTTTTTCAA AGAATTCTTT TTTCCCAAAA 3120 

AGAGGAGTAA CAAAATGTCA TTTCTGAAAG AGGCTTACTT TATACCAACT AGTGTCAGCA 3180 

TTTGGGATGC CAGGGAACAG AGAGTGAGAC ACCTACAATC ACCAGTCTCA AATGCGCTAT 3240 

TGTTTCTTTT CAGAGTGTTG CAGATTTGCC ATTTCTCCAT AATATGGGGA TAGAAAATGG 3300 

AATAAAGATA GAAGGGATGT AGAATATGCT TTCCTGCCAA CATGGTITGG AGTCGACTTT 3360 

GGTATATTGA CTAGATTTGA AAATACAAGA TTGATTAGAT GAATCTACAA AAAAGTTGTC 3420 

CTCCTCTCAG GTCCCTTTTA CACTTTTTGA CTAACTAGCA TCTATATTCC ACACTTAGCT 3480 
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TTTTTGTCAC ACTTATCCTT TGTCTCCGTA AATTTCATTT GCAGTGGTTA GTCATCAGAT 3540 

ATTTTAGCCA CCTACACAAA AGCAAACTGC ATTTTTAAAA ATCTTTCTGA GATGGGAGAA 3600 

AATGTATTCT CCTTTCCTAT ACCGCTCTCC CAACAAAAAA ACAACTAGTT AGTTCTACTA 3660 

ATTAGAAACT TGCTGTACTT TTTCTTTTCT TTTAGGGGTC AAGGACCCTC TTTATAGCTA 3720 

CCATTTGCCT ACAATAAATT ATTGCAGCAG TTTGCAATAC TAAAATATTT TTTATAGACT 3780 

TTATATTTTT CCTTTTGATA AAGGGATGCT GCATAGTAGA GTTGGTGTAA TTAAACTATC 3840 

TCAGCCGTTT CCCTGCTTTC CCTTCTGCTC CATATGCCTC ATTGTCCTTC CAGGGAGCTC 3900 

TTTTAATCTT AAAGTTCTAC ATTTCATGCT CTTAGTCAAA TTCTGTTACC TTTTTAATAA 3960

CTCTTCCCAC TGCATATTTC CATCTTGAAT TGGTGGTTCT AAATTCTGAA ACTGTAGTTG 4020 

AGATACAGCT ATTTAATATT TCTGGGAGAT GTGCATCCCT CTTCTTTGTG GTTGCCCAAG 4080 

GTTGTTTTGC GTAACTGAGA CTCCTTGATA TGCITCAGAG AATTTAGGCA AACACTGGCC 4140 

ATGGCCGTGG GAGTACTGGG AGTAAAATAA AAATATCGAG GTATAGACTA GCATCCACAT 4200 

AGAGCACTTG AACCTCCTTT GTACCTGTTT GGGGAAAAAG TATAATGAGT GTACTACCAA 4260

TCTAACTAAG ATTATTATAG TCTGGTTGTT TGAAATACCA TTTTTTTCTC CTTTTGTGTT 4320 

TTTCCCACTT TCCAATGTAC TCAAGAAAAT TGAACAAATG TAATGGATCA ATTTAAAATA 4380

TTTTATTTCT TAAAAGCCTT TTTTGCCTGT TGTAATGTGC AGGACCCTTC TCCTTTCATG 4440 

GGAGAGACAG GTAGTTAGCT GAATATAGGT TGAAAAGGTT ATGTAAAAAG AAATTATAAT 4500 

AAAAGGGATA CTTTGCTTTT CAAATCTTTG TTTTCTCTTA TTCTAGGTAA GGCATATTAA 4560

AAATAAATAT GTAAAGAAGA AAAATAAAAG TTGTCTTCAT GG 4602 



(2) INFORMATION FOR SEQ ID NO: 75: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1255 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

CGCGCCCCGG GCCGGCGGGT TTCTCTAACA AATAAACAGA ACCCGCACTG CCCAGGCGAG 60 

CGTTGCCACT TTCAAAGTGG TCCCCTGGGG GAGCTCAGCC TCATCCTGAT GATGCTGCCA 120 

AGGCGCACTT TTTATTTTTA TTTTATTTTT ATTTTTTTTT TAGCATCCTT TTGGGGCTTC 180 

ACTCTCAGAG CCAGTTTTTA AGGGACACCA GAGCCGCAGC CTGCTCTGAT TCTATGGCTT 240 

GGTTGTTACT ATAAGAGTAA TTGCCTAACT TGATTTTTCA TCTCTTTAAC CAAACTTGTG 300

GCCAAAAGAT ATTTGACCGT TTCCAAAATT CAGATTCTGC CTCTGCGGAT AAATATTTGC 360

CACGAATGAG TAACTCCTGT CACCACTCTG AAGGTCCAGA CAGAAGGTTT TCACACATTC 420

TTAGCACTGA ACTCCTCTGT GATCTAGGAT GATCTGTTCC CCCTCTGGAT GAACATCCTC 480

TGATGATCAA GGCTCCCAGC AGGCTACTTT GAAGGGAACA ATCAGATGCA AAAGCTCTTG 540

GGTGTTTATT TAAAATACTA GTGTCACTTT CTGAGTACCC GCCGCTTCAC AGGCTGAGTC 600

CAGGCCTGTG TGCTTTGTAG AGCCAGCTGC TTGCTCACAG CCACATTTCC ATTTGCATCA 660

TTACTGCCTT CACCTGCATA GTCACTCTTT TGATGCTGGG GAACCAAAAT GGTGATGATA 720

TATAGACTTT ATGTATAGCC ACAGTTCATC CCCAACCCTA GTCTTCGAAA TGTTAATATT 780

TGATAAATCT AGAAAATGCA TTCATACAAT TACAGAATTC AAATATTGCA AAAGGATGTG 840

TGTCTTTCTC CCCGAGCTCC CCTGTTCCCC TTCATTGAAA ACCACCACGG TGCCATCTCT 900

TGTGTATGCA GGGCTATGCA CCTGCAGGCA CGTGTGTATG CACTCCCCGC TTGTGTTTAC 960

ACAAGCTGTG GGGTGTTACG CATGCCTGCT TTTTTCACTT AATAATACAG CTTGGAGAGA 1020

TTTTTGTATC ACATTATAAA TCCCACTCGC TCTTTTTGAT GGCCACATAA TAACTACTGC 1080

ATAATATGGA TACGCCTTAT TTGATTTAAC TAGTTCCCTA ATGATGGACT TTTAAGTTGT 1140

TTCCTTTTTT TTTCTTTTTT GCTACTGCAA ACGATGCTAT AATAAATGTC CTTATCAAAA 1200

AAAAAAAAAA AAAAAAAAAA AAAAAAANCCC NGGGGGGGGG CCCCGGGAAC NCAAT 1255

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 475 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GGCACGAGAG AAATGTTTGA TTCTCTTTCC TATTTTAAGG GATCTTCTCT CTTGTTGATG 60

TTGAAAACTT ACCTTAGTGA AGATGTCTTT CAACATGCTG TTGTCCTTTA CCTGCATAAT 120

CACAGCTATG CATCTATTCA AAGTGATGAT CTGTGGGATA GTTTTAATGA GGTCACAAAC 180

CAAACACTAG ATGTAAAGAG AATGATGAAA ACCTGGACCC TGCAGAAAGG ATTTCCTTTA 240

GTGACTGTTC AAAAGAAAGG AAAGGAACTT TTTATACAAC AAGAGAGATT CTTTTTAAAT 300

ATGAAGCCTG AAATTCAGCC TTCAGATACA AGGTACATGC CCTCTTTCTT TTCATGCCAT 360

CTCTTTTGCA CTCTCAGGTG GAAATATTTT GAAGTGTTTT ATAATCATAA GTTCTTGTGA 420

AACCTAACAA GATTATCCCT TCCTAAGAAT ACTTAACCTT CCTACCAAAT TAAAA 475



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 465 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

TTCTCTCTGC TCTTCGACTG CACCGCACTC GCGCGTGACC CTGACTCCCC CTAGTCAGCT 60 

CAGCGGTGCT GCCATGGCGT GGCGGCGGCG CGAACCRGCG TCGGGGCTCG CGGCGTGTTG 120 

GCTCTGGCGT TGCTCGCCCT GGCCCTGTGC GTGCCCGGGG CCCGGGGCCG GGCTCTCGAG 180

TCX7TTCTCGG CCGTGGTAAA CATCGAGTAC GTGGACCCGC AGACCAACCT GACGGTGTGG 240 

AGCGTCTCGG AGAGTGGCCG CTTCGGCGAC AGCTCGCCCA AGGAGGGCGC GCATGGCCTG 300 

GTGGGCGTCC CGTGGGCGCC CGGCGGAGAM CTCGARGGCT KCGCGCCCGA CACGCGCTTC 360 

TTCGTGCCCG AGCCCGGCGG CCGAGGGGCC GCGCCCTGGG TCGCCCTGGT GGTCGTGGGG 420 

GCTGCACCTT TCAAGGACAA AGTGCTGGTG GCGGCGCNGA ANGAA 465

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1907 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

ACATGCAGCC CAACTACAGA TTCTTATGGA ATTCCTCAAG GTTGCAAGAA GAAATAAGAG 60

AGAGCAACTG GAACAGATCC AGAAGGAGCT AAGTGTTTTG GAAGAGGATA TTAAGAGAGT 120 

GGAAGAAATG AGTGGCTTAT ACTCTCCTGT CAGTGAGGAT AGCACAGTGC CTCAATTTGA 180 

AGCTCCTTCT CCATCACACA GTAGTATTAT TGATTCCACA GAATACAGCC AACCTCCAGG 240 

TTTCAGTGGC AGTTCTCAGA CAAAGAAACA GCCTTGGTAT AATAGCACGT TAGCATCAAG 300 

ACGAAAACGA CTTACTGCTC ATTTTGAAGA CTTGGAGCAG TGTTACTTTT CTACAAGGAT 360

GTCTCGTATC TCAGATGACA GTCGAACTGC AAGCCAGTTG GATGAATTTC AGGAATGCTT 420 

GTCCAAGTTT ACTCGATATA ATTCAGTACG ACCTTTAGCC ACATTGTCAT ATGCTAGTGA 480 

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TCTCTATAAT GGTTCCAGTA TAGTCTCTAG TATTGAATTT GACC3GGATT GTGACTATTT 540 

TGCGATTGCT GGAGTTACAA AGAAGATTAA AGTCTATGAA TATGACACTG TCATCCAGGA 600 

TGCAGTGGAT ATTCATTACC CTGAGAATGA AATGACCTGC AATTCGAAAA TCAGCTGTAT 660 

CAGTTGGAGT AGTTACCATA AGAACCTGTT AGCTAGCAGT GATTATGAAG GCACTGTTAT 720 

TTTATGGGAT GGATTCACAG GACAGAGGTC AAAGGTCTAT CAGGAGCATG AGAAGAGGTG 780 

TTGGAGTGTT GACTTTAATT TGATGGATCC TAAACTCTTG GCTTCAGGTT CTGATGATGC 840 

AAAAGTGAAG CTGTGGTCTA CCAATCTAGA CAACTCAGTG GCAAGCATTG AGGCAAAGGC 900 

TAATGTGTGC TGTGTTAAAT TCAGCCCCTC TTCCAGATAC CATTTGGCTT TCGGCTGTGC 960 

AGATCACTGT GTCCACTACT ATGATCTTCG TAACACTAAA CAGCCAATCA TGGTATTCAA 1020 

AGGACACCGT AAAGCAGTCT CTTATGCAAA GTTTGTGAGT GGTGAGGAAA TTGTCTCTGC 1080 

CTCAACAGAC AGTCAGCTAA AACTGTGGAA TGTAGGGAAA CCATACTGCC TACGTTCCTT 1140 

CAAGGGTCAT ATCAATGAAA AAAACTTTGT AGGCCTGGCT TCCAATGGAG ATTATATAGC 1200 

TTGTGGAAGT GAAAATAACT CTCTCTACCT GTACTATAAA GGACTTTCTA AGACTTTGCT 1260 

AACTTTTAAG TTTGATACAG TCAAAAGTGT TCTCGACAAA GACCGAAAAG AAGATGATAC 1320 

AAATGAATTT GTTAGTGCTG TGTGCTGGAG GGCACTACCA GATGGGGAGT CCAATGTGCT 1380 

GATTGCTGCT AACAGTCAGG GTACAATTAA GGTGCTAGAA TTGGTATGAA GGGTTAACTC 1440 

AAGTCAAATT GTACTTGATC CTGCTGAAAT ACATCTGCAG CTGACAATGA GAGAAGAAAC 1500 

AGAAAATGTC ATGTGATGTC TCTCCCCAAA GTCATCATGG GTTTTGGATT TGTTTTGAAT 1560 

ATTTTTTTCT TTTTTTCTTT TCCCTCCTTT ATGACCTTTG GGACATTGGG AATACCCAGC 1620 

CAACTCTCCA CCATCAATGT AACTCCATGG ACATTGCTGC TCTTGGTGGT GTTATCTAAT 1680 

TTTTGTGATA GGGAAACAAA TTCTTTTGAA TAAAAATAAA TAACAAAACA ATAAAAGTTT 1740 

ATTGAGCCAC AGTTGAGCTT GGAAAGTTTT TGTCAAATGC NGCAAGAGAT AACTCTTTTT 1800 

ANGAAGTAGC ATATGTGAAC TATAATGTAA CAGTGAATAA TTTGTAAAGT TCGTATTTCC 1860 

CAACCTCTTT GGGAATTACA CATATCAATA TAAACAAAAT ATAAAGT 1907 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1168 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
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GCTGGGGTGT CCCCKCSGCC ACCATCGTCA TCGCTTACTT GATGAAGCAC ACTCGGATGA 60 

CCCATGACTG ATGCTTATAA ATTTGTCAAA GGCAAACGAC CAATTATCTC CCCAAACCTT 120 

AACTTCATGG GGCAGTTGCT AGAGTTCGAG GAAGACCTAA ACAACGGTGT GACACCGAGA 180 

ATCCTTACAC CAAAGCTGAT GGGCGTGGAG ACGGTTGTGT GACAATGGTC TGGATGGAAA 240 

GGATTGCTGC TCTCCATTAG GAGACAATGA GGAAGGAGGA TGGATTCTGG TTTTTTTTCT 300

TTCTTTTTTT TTTTGTAGTT GGGAGTAAGT TTGTGAATGG AAACAAACTT GTTTAAACAC 360 

TTTATTTTTA ACAAGTGTAA GAAGACTATA ACTTTTGATG CCATTGAGAT TCACCTCCCA 420 

CAAACTGACA AATTAAGGAG GTTAAAGAAG TAATTTTTTT AAGCCAACAA TAAAAATATA 480 

ATACAACTTG TTTCTCCCCC TTTTCCTTTT AAGCTATTTG TAGAGTTTAT GACTAAATAG 540 

TCTGTGCAGG TTCATAGACC GAAGATACTA CACACTTTAA ACCAATTAAA AAGAACCAAA 600

AGTAAATAGA AAAGACATTG AATCACCAAG GCCTGGGATC AACCTGGGCT GTCCACACAG 660 

AAAACAAAAA CCCAACCAAA CCAAGCCCTG TTGTGCTCAC TGGTGCAAAG AGAAGATCAG 720 

GGCAGCTTAA GTGGTCTAAG RATCCTTCAG GCATTCTTTA AGGAGAAAAA GGATACCTTT 780 

GATTTTGTGT GTTTCATGCT CTGGATTTTT TTTTTTTTTC CTTCTCTGGG TTTAAGAGAT 840 

TTTTTTTGAA ATAGTGAGGA ACTGACCATT ATATGCCTTC ACTGGCTTCT TGTGCAATAA 900

TATGATGTTT TAAGTGTGCA AACAAGTTAG AGCTGGCAGC TGAATGATAG ACAAATAGTG 960 

CAAATTTGCC AGCTTGGAGA TAGAAAGGAA TTCAACAATA TATCAAATAC TTTCCTTCCC 1020 

ACCTTTTTCC TTTTTTTTTT TTTTTTCTGA TTTGATTCTG GTTACAGTGC CATAAACCTT 1080

GTTACATATG TATATCAGAA TGTAAGAAAA AAAAATTTAT TTAAAAATAT TTTTCGCAAA 1140 

AAAAAAANNA AAAAACTCGA GGGGGGCC 1168

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

AGAAAATCAC ATCCTAACAA AGAAGTCTGT CTAAGACAGT ACATCTCCTG TTGAACTTGC 60
ATCTTTCCAC AGGACTTTCT GTTTTTAGGG ATGAGACTAT TCTCTGCTTC ATCAAGGAAA 120 
GAGAAATGTT CAGGGTTGTA GGGATGGCAC ACTTATTAGT TCTGCCTGTC TGAAAGGTTC 180 

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CTGCAGGACA GTTTGGTCAG AGCTGCAATT CTTAGTCCAT GGTCTAATCC TTGAGTATCT 240 

CTTCTTTCCC TTTCCTGTCT CAGGAATCAG CTGAGAATTC ATTCGATTCT CATCCCTCTA 300

GCCCCTTACT GTGATTTGTT GGTTGCACTT TCATTTGCTT TAGTTCTAGA ATCACCTGTT 360 

GACTCCTCAG ACTTCACCTA ACTTTGGAAA CTCTCTTTTG GAGGCTTCTC ATTTCCCCCT 420 

AATTCTGTGC TGCCTGAGCC CTAGAATTTT CCCACCAACG AATTATTCCA GGTAGATCCT 480 

AAGTTGCTGG ATCTAGTTGA TATTTAAACA ATATCTAGTT GATATTTCTC ATTCAGTTGG 540 



ATCCAGAAAC CAGTATCTCT NAAAAACAAC CTCTCATACC TTGTGGACCT AATTTTGTGT 600

GCGTGTGTGT GTGCGCGCAT ATGTATATAG ACAGGCACAT CTTTTTTACT TTTGTAAAAG 660

CTTATGCCTC TTTGGTATCT ATATCTGTGA AAGTTTTAAT GATCTGCCAT AATGTCTTGG 720

GGACCTTTGT CTTCTGTGTA AATGGTACTA GAGAAAACAC CTATATTATG AGTCAATCTA 780

GTTGGTTTTA TTCGACATGA AGGAAATTTC CAGATAACAA CACTAACAAA CTCTCCCTTG 840

ACTAGGGGGA CAAAGAAAAG CAAAACTGAC CATAAAAAAC AATTACCTGG TGAGAAGTTG 900

CATAAACAGA ATTAGGTAGT ATATTGAAGA CAGCATCATT AAACAGTTAT GTTGTTCTCC 960

TTGCAAAAAA CATGTACTGA CTTCCCGTTG AGTAATGCCA AGTTGTTTTT TTTATTATAA 1020

AACTTGCCCT TCATTACATG TTTCAAAGTG GTGTCGTGGG CCAAAATATT GAAATGATGG 1080

AACTGACTGA TAAAGCTGTA CAAATAAGCA GTGTGCCTAA CAAGCAACAC AGTAATGTTG 1140

ACATGCTTAA TTCACAAATG CTAATTTCAT TATAAATTGT TTTGCTAAAA TACACTTTGA 1200

AACTATTTTT CTGTATTCCA AGAGCTGAGA TCTTAGATTT TATGTAGTAT TAAGTGAAAA 1260

AATACGAAAA TAATAAACAT TGAAG 1285

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1290 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

TCTCCAGCCC CAATTTCTAC GCGCACCGGA AGACGGAGGT CCTCTTTCCT TGCCTAACGC 60 

AGCCATGGCT CGTGGTCCCA AGAAGCATCT GAAGCGGGTG GCAGCTCCAA AGCATTGGAT 120

GCTGGATAAA TTGACCGGTG TGTTTGCTCC TCGTCCATCC ACCGGTCCCC ACAAGTTGAG 180 

AGAGTGTCTC CCCCTCATCA TTTTCCTGAG GAACAGACTT AAGTATGCCC TGACAGGAGA 240 

TGAAGTAAAG AAGATTTGCA TGCAGCGGTT CATTAAAATC GATGGCAAGG TCCGAACTGA 300
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TATAACCTAC CCTGCTGGAT TCATGGATGT CATCAGCATT GACAAGACGG GAGAGAATTT 360 

CCGTCTGATC TATGACACCA AGGGTCGCTT TGCTGTACAT CGTATTACAC CTGAGGAGGC 420 

CAAGTACAAG TTGTGCAAAG TGAGAAAGAT CTTTGTGGGC ACAAAAGGAA TCCCTCATCT 480 

GGTGACTCAT GATGCCCGCA CCATCCGCTA CCCCGATCCC CTCATCAAGG TGAATGATAC 540 

CATTCAGATT GATTTAGAGA CTGGCAAGAT TACTGATTTC ATCAAGTTCC ATTCACCCAG 600

CCAGGTGGTC TCGTCACCTC AGAGGCTCCG CAGACTCCTG CCCAGGCCAG GACTGAGGCA 660 

AGCCTCAAGG CACTTCTAGG ACCTGCCTCT TCTCACCAAG ATGAACTCAC TGGTTTCTTG 720 

GCAGCTACTG CTTTTCCTCT GTGCCACCCA CTTTGGGGAG CCATTAGAAA AGGTGGCCTC 780 

TGTGGGGAAT TCTAGACCCA CAGGCCAGCA GCTAGAATCC CTGGGCCTCC TGGCCCCSGG 840 

GGAGCAGAGC CTGCCGTGCA CCGAGAGGAA GCCAGCTGCT ACTGCCAGGC TGAGCCGTCG 900

GGGGACCTCG CTGTCCCCGC CCCCCGAGAG CTCCQGGAGC CCCCAGCAGC CGGGCCTGTC 960 

CGCCCCCCAC AGCCGCCAGA TCCCCGCACC CCAGGGCGCG GTGCTGGTGC AGCGGGAGAA 1020 

GGACCTGCCG AACTACAACT GGAACTCCTT CGGCCTGCGC TTCGGCAAGC GGGAGGCGGC 1080 

ACCAGGGAAC CACGGCAGAA GCGCTGGGCG GGGCTGAGGG CGCAGGTGCG GGGCAGTGAA 1140 

CTTCAGACCC CAAAGGAGTC AGAGCATGCG GGGCGGGGGC GGGGGGCGGG GACGTAGGGC 1200

TAAGGGAGGG GGCGCTGGAG CTTCCAACCC GAGGCAATAA AAGAAATGTT GCGTAACTCA 1260 

AAAAAAAAAA AAAAAAAANC TCGGGGGGGG 1290 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

TTTATTGTAT TCTGTAACTA TAGAACTTCT ATTTWATTCT TTTTTGGACT TGCTAAGTTG 60 

TCTTTWATGG TTTTWAGTTC CATGCTGAAG TTTTCAGTAT TGACTTATCC CCTTGAACAT 120 

GAGTTGTTTT ATAGACTCTR ATGATTCAAA AATCTTACAT CTTTTGGTAG TCTCTTTCAT 180 

TTGTYCACTG TTTCTGTTGA TTCTOACTCA TGGTATTTTA ATTCTTCGTT WTTTTTTTTC 240

TGTTWAGAWA CATTCTTTGA AAAATAATTT GGAGGAATAT TTGATTCTTA TGAACAAGGC 300 

ATTACTCACC AGAGAAGATT TTTTTGTTYT ACCARGTGCC TARGAATGCT AACAGTCTGG 360 

GAMCACATAG AMCACCAGGT GATGAGACAA TCCTGGGART CCTGTTTTAC TTTGGSCCAT 420 

CTTTTCTCCC AACCCTGTGG GAATARTCAT YCATATCCTA RCTGCAGGCT ARAAGGTGGT 480 

TTATCAGAGC CCAACTTCGA GGGCTCTGGG CTTTAGCTAC TGTCACCCCA TCATAACTGA 540

GCTTCATGGA TTGATTCTCT TTTTATCTTT CAGATTTTCT TTTAAAAATC TTTGTTTTTT 600 

TTTTTCTTCC GAAAGATTCC CCCAACATTA CCATTCCCCA CCTTCCGTTG AATTTTTTTG 660 

GCTCTCATTT TGAATTTTTC AAGA 684 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2024 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CTGCAGGAAT TCGGCACAGC TGCGCTGGAG GCTTCATCTT TGCCGCCGCT GCCGTCGCCT 60 

TCCTGGGATT GGAGTCTCGA GCTTTCTTCG TTCGTTCGYC GGCGGGTTCG CGCCCTTCTC 120 

GCGCCTCGGG GCTGCGAGGC TGGGGAAGGG GTTGGAGGGG GCTGTTGATC GCCGCGTTTA 180

AGTTGCGCTC GGGGCGGCCA TGTCGGCCGG CGAGGTCGAG CGCCTAGTGT CGGAGCTGAG 240 

CGGCGGGACC GGAGGGGATG AGGAGGAAGA GTGGCTCTAT GGCGATGAAA ATGAAGTTGA 300 

AAGGCCAGAA GAAGAAAATG CCAGTGCTAA TCCTCCATCT GGAATTGAAG ATGAAACTGC 360 

TGAAAATGGT GTACCAAAAC CGAAAGTGAC TGAGACCGAA GATGATAGTG ATAGTGACAG 420 

CGATGATGAT GAAGATGATG TTCATGTCAC TATAGGAGAC ATTAAAACGG GAGCACCACA 480

GTATGGGAGT TATGGTACAG CACCTGTAAA TCTTAACATC AAGACAGGGG GAAGAGTTTA 540 

TGGAACTACA GGGACAAAAG TCAAAGGAGT AGACCTTGAT GCACCTGGAA GCATTAATGG 600 

AGTTCCACTC TTAGAGGTAG ATTTGGATTC TTTTGAAGAT AAACCATGGC GTAAACCTGG 660 

TGCTGATCTT TCTGATTATT TTAATTATGG GTTTAATGAA GATACCTGGA AAGCTTACTG 720 

TGAAAAACAA AAGAGGATAC GAATGGGACT TGAAGTTATA CCAGTAACCT CTACTACAAA 780

TAAAATTACG GTACAGCAGG GAAGAACTGG AAACTCAGAG AAAGAAACTG CCCTTCCATC 840 

TACAAAAGCT GAGTTTACTT CTCCTCCTTC TTTGTTCAAG ACTGGGCTTC CACCGAGCAG 900 

GAGATTACCT GGGGCAATTG ATGTPATCGG TCAGACTATA ACTATCAGCC GAGTAGAAGG 960 

CAGGCGACGG GCAAATGAGA ACAGCAACAT ACAGGTCCTT TCTGAAAGAT CTGCTACTGA 1020 

AGTAGACAAC AATTTTAGCA AACCACCTCC GTTTTTCCCT CCAGGAGCTC CTCCCACTCA 1080
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CCTTCCACCT CCTCCATTTC TTCCACCTCC TCCGACTGTC AGCACTGCTC CACCTCTGAT 1140 

TCCACCACCG GGTTTTCCTC CTCCACCAGG CGCTCCACCT CCATCTCTTA TACCAACAAT 1200 

AGAAAGTGGA CATTCCTCTG GTTATGATAG TCGTTCTGCA CGTGCATTTC CATATGGCAA 1260 

TGTTGCCTTT CCCCATCTTC CTGGTTCTGC TCCTTCGTGG CCTAGTCTTG TGGACACCAG 1320 

CAAGCAGTGG GACTATTATG CCAGAAGAGA GAAAGACCGA GATAGAGAGA GAGACAGAGA 1380

CAGAGAGCGA GACCGTGATC GGGACAGAGA AAGAGAACGC ACCAGAGAGA GAGAGAGGGA 1440 

GCGTGATCAC AGTCCTACAC CAAGTGTTTT CAACAGCGAT GAAGAACGAT ACAGATACAG 1500 

GGAATATGCA GAAAGAGGTT ATGAGCGTCA CAGAGCAAGT CGAGAAAAAG AAGAACGACA 1560 

TAGAGAAAGA CGACACAGGG AGAAAGAGGA AACCAGACAT AAGTCTTCTC GAAGTAATAG 1620 

TAGACGTCGC CATGAAAGTG AAGAAGGAGA TAGTCACAGG AGACACAAAC ACAAAAAATC 1680

TAAAAGAAGC AAAGAAGGAA AAGAAGCGGG CAGTGAGCCT GCCCCTGAAC AGGAGAGCAC 1740 

CGAAGCTACA CCTGCAGAAT AGGCATGGTT TTCGCCTTTT GTGTATATTA GTACCAGAAG 1800 

TAGATACTAT AAATCTTGTT ATTTTTCTGG ATAATGTTTA AGAAATTTAC CTTAAATCTT 1860 

GTTCTGTTTG TTAGTATGAA AAGTTAACTT TTTTTCCAAA ATAAAAGAGT GAATTTTTCA 1920 

TGTTAAGTTA AAAATCTTTG TCTTGTACTA TTTCAAAAAT AAAAAGACAG CAATGACTTT 1980

ATATCCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAGGGC GGCC 2024 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 931 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

CGCGCCMATA GCCGGACGGG GATCTGAGCT GGCAGGATGA ATGTGGGGGT GGCACACAGC 60 

GAAGTAAACC CCAACACCCG AGTGATGAAT AGCCGAGGCA TCTGGCTGGC CTACATCATC 120 

TTGGTAGGAT TGCTGCATAT GGTTCTACTC AGCATCCCCT TCTTCAGCAT TCCTGTTGTC 180 

TGGACCCTGA CCAACGTCAT CCATAACCTG GCTACGTATG TCTTCCTTCA TACGGTGAAA 240 

GGGACACCCT TTGAGACTCC TGACCAAGGA AAGGCTCGGC TACTGACACA CTGGGAGCAA 300

ATGGACTATG GGCTCCAGTT TACCTCTTCC CGCAAGTTCC TCAGCATCTC TCCTATTGTG 360 

CTCTATCTCC TGGCCAGCTT CTATACCAAG TATGATGCTG CGCACTTCCT CATCAACACA 420 

GCCTCATTGC TAAGTGTACT GCTGCCGAAG TTGCCCCAGT TCCATGGGGT TCGTGTCTTT 480

GGCATCAACA AATACTGAGG GATGGGTTTT GGGACAGCTC CATGGGCATG GGGAAGGCAC 540 

TGAAACAGAG GACTATAAAA CATCCTTCTC TTATTCTCCA TACTGTCTTC TACACCTTTA 600 

AAGCCTGAGA ACTATACAAC CTTTCCCAGA CTCCCAAGAA GAGAAGAGAT TGGCAAATGG 660 

GGCTCCTGGG CCCAGTCCTG CTAGTGGCAA GTTTCTTTGA ATCAGGAAGG CAGGTGAGGT 720 

AAGGGCCAAA TCACTCTCCT CCATAGCAGG AAGCCATTTG GGCAGCTCCT TTGGTGATTA 780 

CATCTTTCCA TATCTTTTAC ACTTACCACC TTCCAGCTCT GTTTTGCTGT GTATTTTTCT 840 

TACAATAATT TTTTTCAGCT ATAGCTGCAG TTTAATCAGG ATGGGTAGAG AGCTGTCCTC 900

ATAAGGCTGG GGGTGGGAAG ATGGAATACT G 931 



(2) INFORMATION FOR SEQ ID NO: 85: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 825 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CGGGGCCGGC GGGGTCTTCA GGGTACCGGG CTGGTTACAG CAGCTCTACC CCTCACGACG 60 

CAAACATGGC AGCGCAGAAG GACCAGCAGA AAGATGCCGA GGCGGAAGGG CTGAGCGGCA 120 

CGACCCTGCT GCCGAAGCTG ATTCCCTCCG GTGCAGGCCG GGAGTGGCTG GAGCGGCGCC 180 

GCGCGACCAT CCGGCCCTGG AGCACCTTCG TGGACCAGCA GCGCTTCTCA CGGCCCCGCA 240 

ACCTGGGAGA GCTGTGCCAG CGCCTCGTAC GCAACGTGGA GTACTACCAG AGCAACTATG 300

TGTTCGTGTT CCTGGGCCTC ATCCTGTACT GTGTGGTGAC GTCCCCTATG TTGCTGGTGG 360 

CTCTGGCTGT CTTTTTCGGC GCCTGTTACA TTCTCTATCT GCGCACCTTG GAGTCCAAGC 420 

TTGTGCTCTT TGGCCGAGAG GTGAGCCCAG CGCATCAGTA TGCTCTGGCT GGAGGCATCT 480 

CCTTCCCCTT CTTCTGGCTG GCTGGTGCGG GCTCGGCCGT CTTCTGGGTG CTGGGAGCCA 540 

CCCTGGTGGT CATCGGCTCC CACGCTGCCT TCCACCAGAT TGAGGCTGTG GACGGGGAGG 600

AGCTGCAGAT GGAACCCGTG TGAGGTGTCT TCTGGGACCT GCCGGCCTCC CGGGCCAGCT 660 

GCCCCACCCC TGCCCATGCC TGTCCTGCAC GGCTCTGCTG CTCGGGCCCA CAGCGCCGTC 720 

CCATCACAAG CCCGGGGAGG GATCCCGCCT TTGAAAATAA AGCTGTTATG GGTGTCATTC 780 

AGGAAAAAAA AAAAAAAAGG GGGGCCCCTC TAGGGGTCAA AGTTA 825 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1238 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
CATGTAAAAG GATGAAATGT GACTTCTGGT GTTTTTTTAT TTCTATGGAG GGACTTTCTG 60 
GGGACGGTTT CTGGCTCTCA GGCTCTGAGA AGCTGCAGTT TATGAGTGGC TCTGTGTGTG 120
CTGCCACCTA CTGGAGAAGC CATAAGCTGC AGCTTTAGGA AAAGGGAACC CGGGGCAGAG 180
TGTGGGGAAG TGGGATGGCA GCATGGCAGG GCTTTGGAAA ATGAGAGGTG AGAGTKTKTC 240
CAGGAAGGGT GTAAGGAGAG GATGGATCCT GATACATGGA TTCAGGATCA TTAGGGTCCT 300
GTCTGGGACA CTGGCCTTCC TGCTTACCTG CTCTTTCCTT CCTCCTTGGT CGGAGGAGGG 360
GCTGGCTCAC TGCTCTGGCT TCATTTTCCA GAGCTGCCTG CTGCAGTCAC ACTTAGGTCA 420
TCTTCTCTCA CTTTTCTCCT TTTGCCGATT AGTGGACGTG ACAGAGATGT GAATGGQGCA 480
GGGATGTCCT TTGATGGCAT CAAGACTTTA GCTTCTGGTG CGCTGTGTCC CAGCTCTGAT 540
TTCAGTTGCA GCCGTGATGG AMAGTTNGCA TGGAAGCTGA GACTCTCACT GACAGTGAAA 600
CCCTCAAATG AACACAATCC CTGCTTTCCT GCCAAGGATC CTTGTAGGGT NCCCCCAGCT 660
TCCCCACTTT TTTTCTGTGT CCTGACAAAG AAACACAGAG TAACTTGATT GCCCTGTGAC 720
CTGGCCAGTT GCATTTCCCC TGCAGGCTTG AGCCCAAGCC AGAGCCTTGA AAAGGTATTC 780
AGGTTGTTGC CCAAAACACT GAAAAAAACT GCCCTGGCCC TGAACCAAAT ACCTTGAACC 840
CTCGTAAACT CCATACCCTG ACCCCCTTGT TTTGGATATA CCCAGGTAGA ACAACTCTCT 900
CTCACTGTCT GTTGTGAGGA TACGCTGTAG CCCACTCATT AAGTACATTC TCCTAATAAA 960
TGCTTTGGAC TGATCACCCT GCCAGTCTTT TGTCTTGGGC AATCTATACT TTTNCTCAGA 1020
GGTTCCCAAG GCCTACTGAA GGGACTTAAC ATACTCTTAA TGGCTTTCCT CTCTCTTGTT 1080
TTACCTTATG CCCTCACTTC CTGAGTTAAC CTCCCAAATA CAGGATTCAC CTGTACCCAA 1140
GCCCTTAGCT TCAAGAATAC AGGATCACCT GTACCCAAGC CCTTAGCTCA AGCTCTGCTT 1200
TGGAAGAACC CAAACTAAGA CAGTGCTCCT GGTGCCCT 1238


(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1460 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

ATTGCCTTCT GGTCCCTGGT GACACTGGGG TCATCCTTCA TCCCCGGAGA GOVITTCTGG 60 

CTGCTCCTCC TGACCCGGGG CCTGGTGGGG GTCGGGGAGG CCAGTTATTC CACCATCGCG 120

CCCACTCTCA TTGCCGACCT CTTTGTGGCC GACCAGCGCG ACCGGATGCT CAGCATCTTC 180 

TACTTTGCCA TTCCGGTGGG CAGTGGTCTG GGCTACATTG CAGGCTCCAA AGTGAAGGAT 240 

ATGGCTGGAG ACTGGCACTG GGCTCTGAGG GTGACACCGG GTCTAGGAGT GGTGGCCGTT 300 

CTGCTGCTGT TCCTGGTAGT GCGGGAGCCG CCAAGGGGAG CCGTGGAGCG CCACTCAGAT 360 

TTGCCACCCC TGAACCCCAC CTCGTGGTGG GCAGATCTGA GGGCTCTGGC AAGAAATCCT 420

AGTTTCGTCC TGTCTTCCCT GGGCTTCACT GCTGTGGCCT TTGTCACGGG CTCCCTGGCT 480 

CTGTGGGCTC CGGCATTCCT GCTGCGTTCC CGCGTGGTCC TTGGGGAGAC CCCACCCTGC 540 

CTTCCCGGAG ACTCCTGCTC TTCCTCTGAC AGTCTCATCT TTGGACTCAT CACCTGCCTG 600 

ACCGGAGTCC TGGGTGTGGG CCTGGGTGTG GAGATCAGCC GCCGGCTCCG CCACTCCAAC 660 

CCCCGGGCTG ATCCCCTGGT CTGTGCCACT GGCCTCCTGG GCTCTGCACC CTTCCTCTTC 720

CIOTCCCTTG CCTGCGCCCG TGGTAGCATC GTGGCCACTT ATATTTTCAT CTTCATTGGA 780 

GAGACCCTCC TGTCCATGAA CTGGGCCATC GTGGCCGACA TTCTGCTGTA CGTGGTGATC 840 

CCTACCCGAC GCTCCACCGC CGAGGCCTTC CAGATCGTGC TGTCCCACCT GCTGGGTGAT 900 

GCTGGGAGCC CCTACCTCAT TGGCCTGATC TCTGACCGCC TGCGCCGGAA CTGGCCCCCC 960 

TCCTTCTTGT CCGAGTTCCG GGCTCTGCAG TTCTCGCTCA TGCTCTGCGC GTTTGTTGGG 1020

GCACTGGGCG GCGCACTTCC TGGGCACCGC CATCTTCATT GAGGCCGACC GCCGGCGGGC 1080 

ACAGCTGCAC GTGCAGGGCC TGCTGCACGA AGCAGGGTCC ACAGACGACC GGATTGTGGT 1140 

GCCCCAGCGQ GGCCGCTCCA CCCGCGTGCC CGTGGCCAGT GTGCTCATCT GGAGAGGCTG 1200 

CCGCTCACCT ACCTGCACAT CTGCCACAGC TGGCCCTGGG CCCACCCCAC GAAGGGCCTG 1260 

GGCCTAAACC CCTTGGCCTG GCCCAGCTTC CAGAGGGACC CTGGGCCGTG TGCCAGCTCC 1320

CAGACACTAC ATGGGTAGCT CAGGGGAGGA GGTGGGGGTC CAGGAGGGGG ATCCCTCTCC 1380 

AACAGGGGCA GCCCCAAGGG CTCGGTGCTA TTTGTAACGG GATTAAAATT TGTAGCCAGA 1440 

AAAAAAAAAA AAAAAAAAAA 1460 



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

CAGGTGCAAA GTGGGAAGTG TGAGTCCTCA GTCTTGGGCT ATTCGGCCAC GTGCCTGCCG 60 

GACATGGGAC GCTGGAGQGT CAGCAGCGTG GAGTCCTGGC CTTTTGCGTC CACGGGTGGG 120 

AAATTGGCCA TTGCCACGGC GGGAACTGGG ACTCAGGCTG CCCCCCGGCC GTTTCTCATC 180

CGTCCACCGG AYTCGTGGGC GCTCGCACTG GCGCTGATGT AGTTTCCTGA CCTCTGACCC 240

GTATTGTCTC CAGATTAAAG GTACGACATT TGGAGGCCCC AGCGAGAAAC GTCACCGGGA 300

GAAACGTCAC CGGGCGAGAG CGGKCCCGCT GTGTGCTCCC CCGGAAGGAC AGCCAGCTTG 360

TAGGGGGGAG TGCCACCTGA AAAAAAAATT TCCAGGTCCC CAAAGGGTGA CCGTCTTCCG 420

GAGACAGCGG ATCGACTACC ATGTGGGTGC CCACAAAAAT TYCACCTYTG AGTCCTCAAC 480

TGCTGACCCC GGGGTCAGTT CCAGAGAGAA GGACTCCCTC CTGCTTGGAA GAGACCTCAC 540

ACCGTCATCA CGATGCCAAC GGCTCTGAAG GTGGATGGCA TTCCTGCGTG GATTCATCAC 600

TCCCGCATCA AAAAGGCCAA CRGAGCCCAA CTAGAAACAT GGGTCCCCAG GGCTOGGTCA 660

GGCCCCTTAA AACTGCACCT AAGTTGGGTG AAGCCATTAG ATTAATTCTT TTTCTTAATT 720

TTGTAAAACA ATGCATAGCT TCTGTCAACT TATGTATCTT AAGACTCAAT ATAACCCCCT 780

TGTTATAACT GAGGGAATCA ATGATTTGAT TCCCCAAAAA CACAAGTGGG GAATGTAGTG 840

TCCAACCTGG TTTTTACTAA CCCTGTTTTT AGACTYTCCC TTTCCTTTAA TCACTCAGCC 900

TTGTTTCCAC CTGAATTGAC TCTCCCTTAG CTAAGAGCGC CAGATGGACT CCATCTTGGC 960

TCTTTCNACT GGCAGCCGCT TCCTYCAAGG ACTTAACTTG TGCAAGCTGA CTCCCAGCAC 1020

ATCCAAGAAT GCAATTAACT GATAAGATAC TGTGGCAAGC TATATCCGCA GTTCCCAGGA 1080

ATTCGTCCAA TTGATTACAC CCMAAAGCCC CGCGTCTATC ACCTTGTAAT AATCTTAAAG 1140

CCCCTGCACC TGGAACTATT AACGTTCCTG TAACCATTTA TCCTTTTAAC TTTTTTGCCT 1200

ACTTTATTTC TGTAAAATTG TTTTAACTAG ACCCCCCCTC TCCTTTCTAA ACCAAAGTAT 1260

AAAAGCAAAT CTAGCCCCTT CTTCAGGCCG AGAGAATTTC GAGCGTTAGC CGTCTCTTGG 1320

CCACCAGCTA AATAAACGGA TTCTTCATGT GTAAAAAAAA AAAAAAAAAA CTCGGAGGGG 1380

GGGCCCGGTA CCCAA 1395

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1186 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

GGCACGAGCC GGCAAGCCGA GCTAGGGTGA AAACTGGGGG CGCACCAGGA TGTNNGACAG 60 

AAAAGCAGAA GATGAGACTC TGTTCATTCA CTTTTCCTAG GCCCATCCTG TGGTCATCTT 120

TCCCCCTCCC ATCATACCTC CTCCTTCCTG GAGCCTCTGC CGGCTTGGCT GTAATGGTGG 180 

CACTTACCTG GATATTTCAG TGGGAGGATG AAAGGCGAGA CTCACCCTAC GCGGTGGGAC 240 

AGATGGGGAG AGGAAAAAGG CAGAGATGGC CAGGAGAGGG GTGCAGGACA AACCAGAGAG 300

GTTGGGTCAG GGGAAAAGGG TGGGGAGAAA GAGGGGTGCA GGCCCTGCAG GCCGGTTAGC 360 

CAGCAGCTGC GGCCTCCCCG GGCCCTTGGC ATCCAACTTC GCAGACAGGG TACCAGCCTC 420 

CTGGTGTGTA TCATAGGATT TGTTCACATA GTGTTATGCA TGATCTTCGT AAGGTTAAGA 480 

AGCCGTGGTG GTGCACCATG ACATCCAACC CGTATATATA AAGATAAATA TATATATATA 540 

TGTATGTAAA TTATGGCACG AGAAATTATA GCACTGAGGG CCCTGCTGCC CTGCTGGACC 600

AAGCAAAACT AAGCCTTTTG GTTTGGGTAT TATGTTTCGT TTTGTTATTT GTTTGTTTTT 660 

GTGGCTTGTC TTATGTCGTG ATAGCACAAG TGCCAGTCGG ATTGCTCTGT ATTACAGAAT 720 

AGTGTTTTTA ATTCATCAAT GTTCTAGTTA ATGTCTACCT CAGCACCTCC TCTTAGCCTA 780

ATTTTAGGAG GTTGCCCAAT TTTGTTTCTT CAATTTTACT GGTTACTTTT TTGTACAAAT 840 

CAATCTCTTT CTCTCTTTCT CTCCTCCCCA CCTCTCACCC TTGCCCTCTC CATCTCCCTC 900

TCCCGCCCTC CCCTCCTCCC TCTGGCTCCC CGTCTCATTT CTGTCCACTC CATTCTCTCT 960 

CCCTCTCTCC TGCCTCCTGC TGCCCCCTCC CCAGCCCACT TCCCCGAGTT GTGCTTGCCG 1020 

CTCCTTATCT GTTCTAGTTC CGAAGCAGTT TCACTCGAAG TTGTGCAGTC CTGGTTGCAG 1080 

CTTTCCGCAT CTGCCTTCGT TTCGTGTAGA TTGACGCGTT TCTTTGTAAT TTCAGTGTTT 1140 

CTGACAAGAT TTAAAAAAAA AAAAAGGAAA AAAAAAAAAA AAAAAA 1186



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1821 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

AAAACATGCT TTCAGGGCGT CCCCTATGTA TTCGGGGGGC CCACGGACAC TCAGGCTGGA 60 

KATCCGTCCT CACTGCGCTC AAGATGGCCT CAGCAGACAC CAGTTACCCA GCTGAAAGTC 120 

ACAATCCCTC CCAGAAGTCT CCCAACACTA GTGCTGACCA GAGGTGGGGC TCTCAGGCTA 180 

GGAGTTTCAC ACACAATGAC AGGCTGCTGG GGGACATTGC AGGACCCCTT TTCCTYTCCT 240 

CTCCATGCTA GAAGCCAGCC CTAGGMAGCT GCAGTTACTC CCTGTGACTC AGCAGCAGGC 300 

TGATTCAACA CAGCTGCCCA CACAAAGCCA GTGGTAATAC ATCTGTTTAC CTTTCCCTAT 360

CACCCAGACA CAAGCCCCTT TCCCAGGTCA AACCACAGGC CGATGCATCT CCAGTTTGAC 420 

AGTCAAATCA CTACTTCCAT TGCTACTTTA GATCAGCCAA AGTGGTGACT GCTGCAGTGT 480

GTGGCTATCC CTACAAGGCC CACCCAAGGG ATGCCCAAAG CCCAACCTTC TCCAGGGCTG 540 

CAGCAGNAGC AACCCCACCA GCCTAAGTCC AGCAGAGGAC CTCCCACCCA ATGTCTTGTT 600 

CTAATTAGAA GGGGAAGTTA GCCACAGAAA ATCAACTTAT CTATAATTAC AAAATTCTCT 660

TGACTCACCT TAAAGTTCCT ATTGACATCT ACTGCTTTTA AACCTATTTG AAAACTCTGA 720 

TACTAAAACA AATGACACTC TAAGAAAGTT TGGGAGCCCC ATGCTGAGAA CCATTTCTGT 780 

GCAGTGAGGA TGTTTCCAGA AGCTACTTAC CTACATGTGA ATGTGCCATT TTCTTTCCTT 840 

TTGTAGAGAA AATCCCCTTT ACTTTTTGGA ACAGTAATGG CAGCTTCTAG TACAGOCATT 900

ACAGTTTCAT ATGAGAAAAA TTAAGAATAA CTATAAAATT GTTAAAATAT CCAATAATGG 960

ATAATGATGG CCAGAAGATT TAACATACAA AGTAATTCTC AATGTAAAGC TATTCAGCTC 1020 

TTCCAGGTTG AATGCCCTGT AACCCACCCT GACCTTCCAC ATCATCTTCA AAAAGCAGTT 1080 

TCTCTGTTCC CCATGATTCT CCTATAAGGT AACTCTTTAG TCCTCCATTT AGCACATTTT 1140 

AAATCCTCCA AAGAATAAGT ATCATGTGAT TATTTTAGCT TTACAAAAAA AAAGTTGAAT 1200 

GGCGTTTTAT TTTCATGGCC TATAAGCAGG TACCTTAGTA GGGCAGATAT AGGAAAAACA 1260

AATTAGAGCA AAACAAATCC TCTACAAATC CAAGGCAGGA AAAGTGGTGG CAGAGTGACT 1320 

CATTCTCCTG TCCCTCCCAT CAGGTCAAAT CAGGAGGCTG CAGTGAATGC CTGTTCTTTG 1380 

AATGTGTAGC AGTTGTTCCT GTAACTCTTT AAAACTTGGC TATAGGCTGT TTAGCACAGT 1440 

ACAGATTAAA GATACAGTTA CGTAAACAGC AAAGTAATTT TATAGTGCTT CATCCATTTA 1500 

TCATGCTTTG GTTTGCTAAT TTTTTCACAT ACCTTTTTCT ATCACAGTCT GTTGCTTTTG 1560

TACACATTTC TCATATTGGG GTTCGACAGG TAAACACAAA CTGCTATTTC AGTAGAAAAA 1620 

GTTATTGTTA TGGAATATTA AACCCAATAA ATTGTATAAA GGGTAAAAAA AAAAAAAAAA 1680 

AAAAAAAAAA AAAAAAAAAA AAAAAAATTC CTGCGGGCCG CANGCTTTTT CCCTTTGGGT 1740
GAGGOGTTAT TTTNGGCTTG GGCACTGGGC CCTTCGTTTT TACAACGTCG TGANGGGGGG 1800 
AACCCGGGGG GGGTTTCCCC C 1821 



(2) INFORMATION FOR SEQ ID NO: 91:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

TGCCCTTTTT CCCACCGATT CGGGGOJTGG TGAAGGTGGG AGATGTGAAC TCCAATTAAG 60 

GGACTGGAGA GAGGTGAAGA ATTTTGCAGG TGGGAGATTT GGATTTGAAT GTGGACTTGT 120 

AAATGACTTG ACCTTGCCAT CTGTGTTCAA GGTCACGGTT TGCTGTGGGG TTCCTGGGAG 180 

AGCTTACTCA CCCCGGAGTC TTTTCTTTCT CTTGCTCCAA GAAGAGCCCT GTTGGTGCTT 240 

TACCACCGCT TGGAGTCTCC CGAGGACACA AACAGGCAGA GAGGGACGTC TAGGGAGAGT 300 

TCTTTCCTGT TTTCTGTGCT TTCCTTTTTA CAGGACTCCC GGAAGGCCAC TCATGGCCAT 360 

GCCAGGAGCT TTCTCAGAAA CAGTCATAAA CGATCTCTTG AGTCTCTTTC TTGTCCTCCC 420 

AGCTGAGCTT TCTTATTCCA CCCTTTCTGG TGTCTATAGG AATGCATGAG AAGACCCTGG 480 

GACGTTTTTC TGCTCTCTTC TGGCCCTCCA TGGAGCCATG GGCCTCGGCC TCGGCGGCTC 540 

CTCACCCTCA CAATTTATTT CCTCCTCCCG TGCCAGCCCT TCTTTTGTGT CTGAAACCGG 600 

TTTTAAAATG TGACTCTCCC AGAGAAGAAG CCGCTGGCTG TATGAAACTT GACGGCGCTT 660 

TTGTAAGGTG CCACCCCCAA ACTTTAAGGT AGCTAAACCA ATTTTTAAAA GATTCAATGG 720 

CTTGTTCATC CTCCAGATGT AGCTATTGAT GTACACTTCG CAACGGAGTC TCTGAAATTG 780 

TGGTGGTCCT GATTTATAGG ATTTCATAAT TAAAATGTCT GCTGAATAAA AAAAAAAAAA 840 

AAAAACTCGA GGGGGGCCCG GT 862 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 696 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

CTGAGGCGAG TGAAGTGGAC TCTGAGGGCT ACCGCTACCG CCACTGCTGC GGCAGGGGCG 60 

TGGAGGGCAG AGQGCCGCGG AGGCCGCAGT TGCAAACATG GCTCAGAGCA GAGACGGCGG 120

AAACCCGTTC GCCGAGCCCA GCGAGCTTGA CAACCCCTTT CAGGACCCAG CTGTGATCCA 180 

GCACCGACCC AGCCGGCAGT ATGCCACGCT TGACGTCTAC AACCCTTTTG AGACCCGGGA 240 

GCCACCACCA GCCTATGAGC CTCCAGCCCC TGCCCCATTG CCTCCACCCT CAGCTCCCTC 300 

CTTGCAGCCC TCGAGAAAGC TCAGCCCCAC AGAACCTAAG AACTATGGCT CATACAGCAC 360 

TCAGGCCTCA GCTGCAGCAG CCACAGCTGA GCTGCTGAAG AAACAGGAGG AGCTCAACCG 420

GAAGGCAGAG GAGTTGGACC GAAGGAGCGA GAGCTGCAGC ATGCTGCCCT GGGRGGCACA 480 

GCTACTCGAC AGAACAATTG GCCCCCTCTA CCTTCTTTTT GTCCAGTTCA GCCCTGCTTT 540 

TTCCAGGACA TCTCCATGGA GATCCCCCAA GAATTTCAGA AGACTGTATC CACCATGTAC 600 

TACCTCTGGA TGTGCAGCAC GSTGGNTCTT CTCCTGAAYT TCMTCGSCTG CCTGGCCAGT 660 

TCTGTGTGGA AACCAACAAT GGCGAGGCTT TGGGTT 696



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1886 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

CAGGCCACTG ACGCTTCTTT GCGAGGGATG CAGGAGGTCC TACAGAGAAA GGCGCTTCTT 60

GCATKTCAGA GGGCCCACAG CCTGTCACCC ACAGATCACC AAGCAGCTTT CTACCTGGCT 120 

CTGCAGCTTG CCATCTCCAG ACAGATCCCA GAGGCTCTGG GGTATGTCCG CCAAGCTCTT 180 

CAGCTTCAAG GTGACGATGC CAACTCCCTG CACCTCCTTG CCCTCCTGCT GTCAGCACAG 240 

AAGCATTACC ATGACGCTCT GAACATCATC GACATGGCCC TGAGTGAATA CCCAGAAAAT 300 

TTCATACTAC TGTTTTCCAA AGTGAAGTTG CAGTCACTCT GCCGAGGCCC GGACGARGCA 360

CTGCTGACTT GTAAGCACAT GCTGCAGATA TGGAAATCCT GCTACAACCT CACCAACCCC 420 

AGTGATTCTG GACGTGGGAG CAGCCTCTTA GATAGAACCA TTGCTGACAG ACGACAGCTT 480 

AATACAATTA CTTTGCCAGA CTTCAGCGAT CCCGAGACAG GCTCCGTCCA TGCCACATCG 540 

GTAGCAGCCT CAAGAGTGGA GCAGGCACTG TCGGAAGTGG CTTCGTCTCT GCAGAGCATG 600 

CCCCTAAGCA GGGCCCGCTG CACCCCTGGA TGACGCTGGC ACAGATCTGG CTCCATGCAG 660

CTGAAGTCTA TATCGGCATC GGGAAGCCTG CAGAAGCCAC AGCCTGTACC CAAGAAGCTG 720 

CCAACCTCTT CCCAATGTCC CACAATGTCC TCTACATGCG CGGCCAGATT GCTGAGCTCC 780 

GGGGAAGCAT GGACGAGGCG CGGCGGTGGT ATGAAGAGGC CTTAGCCAOT CAGCCCCACC 840 

CACGTGAAGA GCATGCAGCG ACTTGGCCCT GATCCTTCAC CAGYTAGGCC GYTACAGTYT 900 

GGCGGAGAAG ATCCTCCGGG ACGCGGTGCA GGTGAACTCG ACAGCCCACG AGGTCTGGAA 960 

CGGGCTGGGC GAGGTCCTCC AAGCTCAGGG CAACGATGCG GCGGCTACGG AGTGCTTCCT 1020 

GACAGCCTTG GAGCTGGAGG CCAGCAGCCC CGCCGTGCCC TTCACCATCA TCCCCCGCGT 1080 

GCTCTGAGCA GGCGCCTGCC AGCCTCACCT GCCGCTCAGC CTNCAGAGGC CCTGCCGGGC 1140 

ACCAGGGCTT GTGCCATCGC CCCAAGGGGA TGAATCTGCC GCACTGAGGC CAGGGACGAG 1200 

TGTTCAGTGG GCCACAGTGA ACCAACCAAA CCAACCCCGA ATCATCGCTC TCGCCATGTG 1260 

CGTTTCTCTT GTTTTTTTTG CCAGCCCAAT GGTAGTTTCT GAACCTATTG ACATTGTTCA 1320 

AAATGGATCA TGTGCCATAT TTTGTTAGTT GACATCTGAG TTTTCAGTAA AATGATTATG 1380 

GAATTAATCA GCAAATGTAG AAGAATATAT TCAAAGTTAA AATTCAGTGG CAGCACAGAT 1440 

TATTTTTATC AGAGCTGTAA AGAAAACAAC TGTCCTTTTC TCCCCACCAC CCCTCCTGCC 1500 

CCACTTTGGC CCAGAAACCA AATGTGAACT TCCTGTCTCC CACCTCAGCA CTAGTCCATG 1560 

CCAGGACACC AGCTGACAAT TTCTTGGTTT TACTGTCAAT AATTGTACCA TGTGATCAAT 1620 

TACTGTCCTC ACTTAGAACA AAGCCTGAGT CCGAGAATAT TTATATTTTA CCAATATATG 1680 

CCTGTTACAA GAGAAGGAAA TATGAGTTAT TTAAGTTTAA CTTTTTTATG TGAATTCAGA 1740 

GTTTATTTAT CGAQGGAAAT ATGTACAAAG AAGCTTCAAA TGGAATATTT ACCGACATTC 1800 

CTTATACATG ACAGACACTT GGCTACATGG GAAGATGATG TTAATAATAA AATGATTTTT 1860

AAATGGAAAA AAAAAAAAAA AAAAAN 1886 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1774 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
CTCAGCTACC GTATACAGTA GGACATAACC CCATTTCACA TGCACTACAC TGAGACTTGC 60

CTCCTCTCCC CCCACATTGA AGATGTTCTT TTTTCATAAC TATATACTAT TCCATTGCAT 120

GAATATTCTG TAATTTATTT AATCCCCTAT GGATTGATAA TTAGGTTCAT TATAGATAGA 180 

AGTGTAATTA ACATTCCTGT ACATGTATTT TGCTACTTGT GTGGGTATTT CTGTAGGATG 240 

AATAACTAGA AATTTATTGG ATCAGGTTTC ACATTTGCAG TTTTGAAAAC TACTACCAAA 300

AAGATTTCAC CAATTTACAA CTCCATCATT AGTAAGAATG CCTGTTTGCC TATAGTCTGC 360 

CAACCCTGAA TCCTTAAAAA TTTTTGCCAA TCTGGTAGGC AAAATTTCTT TXTTTTTCTTT 420 

GAATATTAAT GAGGAGGAAC ATCTTTTCAT GTTTCTTGGC CATTTGCATT TCCTATTATG 480 

AATTGCTTTT GCCCATTTTC CTTTTTTTAA TTATGAAAGT CTAATGACTA CCTTCTCATT 540

GTATAAAAAA CACAGTTCTT TGAATAGAGA GACCCTTTTC TCCAATGCTA CCAATCACAT 600

TCCACTTACC ACAGTTTAAC ATACATCCTC TAGTCACCTT TCCGTACGAA TATACATACA 660 

CATAAAAACA CTTTTTACAT AAATAGGATC TCATATTCTG TAGCTTTTTA AAATTTTGGT 720 

CTCAAAAAAA GATAACAGGT CTTTAAATTT CTTTAATGGT TGAATATGAT TAAATACTAT 780 

GAAAATGCCA TTATTTATTC CCTTAATTTT TTTCCTCTCG CTATTACATT GCCAAAGTAA 840 

ACATCCTATT CAGATGTCTT TGTGCATGTG TGTGAATATT TCTPTAGTCT GGAGTCCAGT 900

AAGGTGGATT TTTGGATCAA AGGGTTTGTT CTCTGTCCAC CTTCAGTCTT CCCAAAGGCC 960 

TTCATAACTG TATTTTCACC AAGTGTATGG AGAATGTTCA TTTCCCCATA TAACCATACC 1020 

TACACTTGAT AGTTTTTATC TGTTGGGCGA AAAAGAACCT TTTCTTATTT TGCATTTCCC 1080 

TGATTATAAA AAAAAATGGT GAGATTGGGG TTATTTTCAT GTTTATTCGC CATTTATAGT 1140 

TTACTGTGGA TTGTTTGTAT CCCTTACCTG CTTTCTATTG GGTTATGTGT GGATATATTG 1200

TTTTTATTTG TTCAGCATCT CCTTCCCCAT CTTCTGGTAA CACAACCTTT ATTTATTTGT 1260 

GGGGAACCTA TTCCCTGTGG CTTAGGTGAG CATGTGACCA GGCCTGGCCT CCTGAGTCCC 1320 

ACAGCTTCCT AGCCACAGTG ATAAAAGAAT GGGTATATAA CTTAAGCCAG GCTAAGGAAA 1380 

GCCCTTAACA GAACTTCTGC TGGAACTACT GGAAAGAAGG CTTTATGGAG ATCCCAGGAA 1440 

CCAAGGACCA TGTAAGCCTG AATTTGTGCC ATGTGGAGAG AGTCTGTCTG AGGAGAAACT 1500

CGGATGCTAG CAGAAATGGA AAGAGAACTA AGTTCTGATG TCATTTTTCT GGAGGCCCTA 1560 

GATCCAGCTG TGCCTAAAGC CIX3CCCTACT CCGGACTTTA AAGTTTTGTG AGCCAATAAA 1620

GTCCCTTTCT TGTTTAAGAT AATTGAATTG AGTTTCTGTT CTGATTAATA TAGGTTATTT 1680 

GTATTTTCTT ATTGATTTGT AGAAAACCTT TGTAATTTTA AATTCTAGAC TTTATGCACT 1740 

ATATAAGTTA ATAAAATTAG CATGGCCTTC CATG 1774

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2503 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

GGCACGAGCG AAGGCAAGGG GGCACCAGCT CAGGACTGCA TCTCCCTGCC ATTTCCCTTC 60

CACTCCTCCT TTCTGGAGTC TGACATTAGA AAGCCAGCGA GAAGGAAGAT TCAAACAACC 120 

AACCCTGATT TCCTGCTTCT CCTTTTCATG AGTGTTCCTG TGGTCTCTGC ACCTCCTTTC 180

TGTCCCCCGG CAGAGGGCAG TAGAGATGGC CGGCCCAAGG CCTCRGTGGC GCGACCAGCT 240 

GCTGTTCATG AGCATCATAG TCCTCGTGAT TGTGGTCATC TGCCTGATGT TATACGCTCT 300 

TCTCTGGGAG GCTGGCAACC TCACTGACCT GCCCAACCTG AGAATCGGCT TCTATAACTT 360

CTGCCTGTGG AATGAGGACA CCAGCACCCT ACAGTGTCAC CAGTTCCCTG AGCTGGAAGC 420 

CCTGGGQGTG CCTCGGGTTG GCCTGGGCCT GGCCAGGCTT GGCGTGTACG GGTCCCTGGT 480 

CCTCACCCTC TTTGCCCCCC AGCCTCTCCT CCTAGCCCAG TGCAACAKTG ATGAGAGAGC 540 

GTGGCGSCTG GCAGTGGGCT TCCTGGCTGT KTCCTCTGTG CTGCTGGCAG GCGGCCTGGG 600 

CCTCTTCCTC TCCTATGTGT GGAATGGGTC ARGCTCTCCC TCCCGGGGCC TGGGTTTCTA 660

GCTCTGGGCA GCGCCCAGSC CTTACTCATC CTCTTGCTTA TAGCCATGGC TGTGTTCCCT 720 

CTGAGGGCTG AGAGGGCTGA GAGCAAGCTT GAGAGCTGCT AAAGGCTTAC GTGATTGCAA 780

GGGTTCAGTT CCAACCATGG TCAGAGGTGG CACATCTGCT CAGCCATCTC ATTTTACAGC 840

TAACGCTGAT CTCCAGCTCC AGCGATGGAA CCCACTACAG AGGAGGTGGG GCCCCTGTGT 900

CAAAGAGGCC GAGGGGCAGC AAGGGCAGMC AGGGCACCTG TGACTTCTTA GTACAAGATT 960

GTCTGTCCTT CAGGACTTCC AAGGCTCCCA AAGACTCCCT AAACCATGCA GCTCATTGTC 1020 

ACACCAATTC CTGCTTTAAT TAATGGATCT GAGCAAATCT TCCTCTAGCT TCAGGAGGGT 1080 

GGGGAGGGAG TGATTGCTGT CATGGGGCCA GACTTCCAGG CTGATTTGCC AAATGCCAAA 1140 

ATGAAACCTA GCAAAGAACT TACGGCAACA AACGAGGACA TTAAAAGAGC GAGCACCTCA 1200 

GTGTCTCTGG GGACATGGTT AAGGAGCTTC CACTCAGCCC ACCATAGTGA GTGGGCCGCC 1260

ATAAGCCATC ACTGGAACTC CAACCCCAGA GGTCCAGGAG TGATCTCTGA GTGACTCAAC 1320 

AAAGACAGGA CACATGGGGT ACAAAGACAA GGCTTGACTG CTTCAAAGCT TCCCTCGACC 1380

TGAAGCCAGA CAGGGCAGAG GCGTCCGCTG ACAAATCACT CCCATGATGA GACCCTGGAG 1440 

GACTCCAAAT CCTCGCTGTG AACAGGACTG GACGG1TGCG CACAAACAAA CGCTGCCACC 1500 

CTCCACTTCC CAACCCAGAA CTTGGAAAGA CATTAGCACA ACTTACGCAT TGGGGAATTG 1560



TGTGTATTTT CTAGCACTTG TGTATTGGAA AACCTGTATG GCAGTGATTT ATTCATATAT 1620

TCCTGTCCAA AGCCACACTG AAAACAGAGG CAGAGACATG TACTCTGGTG TGATCTCTTG 1680

TCCTCAGTGT CTCTTCTGGG CTCCTGTCCC TCTTGCTTTA TAGCTAGCTC CCCGGGGACC 1740 

AAGGTACAGG TGAAAGCAAG GTAGCAGCTT GCGGGAGGAG GCCTGTCTGG CTTACCAGTC 1800 

TATACACTGT GGCCTCAACC TCCCAGACAG GGCAGAGAAC TGTGGGCAGC TCGTTTGCTT 1860

TCTAGGCTGG CTGGAGAGGT GGGAGCTCAT TGATAGACTC ATGATGGAAA CTATTTTTCA 1920 

AhCAGGCl ' VC CTCCTTCAGG AGAGATCATG CGGACTAAAC TGTAGCAATT CCAGTGCACC 1980

TGGCAGTGAT CCTTTTCTTT GCAAAGTACT GTCTCTTTGG TTCCAGTAAG TTGGACCACC 2040 

ACATGACATY ATTTTCCCTG GAACCTGGTC ACTGACTAAC ACAGACAATT GGGACTCCAG 2100 

AGCCTCAAGA GCCAGGAGAG GGCACAGTAC ATACAGAGGG AGTCAAATGG GATCTCATTT 2160

TGAGTCCTGC CTTCCGCACA CTCAGAACGG CANCCCCAAG GCCCGGAGTG TCCAGGGCTT 2220 

CTGGCCTGAG GTGAATCTGC CAGGCCCAAG AAGGCACAAA GGTAGGAGCA CAGAGAGCCC 2280 

CATTCCCACA GGCGGKCGGC CCAGCAGCAC CAGTGGAAGC TCAGCTGTCC TCCAGCTGCT 2340 

CTCGGCAGAC AGTTCAGTGC ACAGTTTATG CCCTAGCTGA AAAAGATCTC CCGGACGTAT 2400 

TTCAGCACAT CCTCTTCCTC CTCCTCCTCA GGGCTCCTGC TACAGGCAGA GCTGGAACCC 2460

CCCGGCCTCT GGGAAGGGCT GAGGCCTGGA GYCAGTGCCT GTC 2503 



(2) INFORMATION FOR SEQ ID NO: 96: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2801 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

CTGGAAAGCC GAGGGTAGCC GAGCGGGGCG GGCGCTCTGG AGCGGCGGGT GCTCGGGCTG 60 

CCGTCCGCTC CGCCAGAAGC ACCGAGCAGC CGAGCCGGGG CCCGCCGCCC TCCTCCTCCA 120 

TGAGGCCCGA GTGAGGCGCG GCGGCTATAG CCGACCCGCG GCGCCTTCCC CCCGCGTCCT 180 

ATCGCGAGCG CACGACMAGC GGCCCCTGGA GGAGGAGGCG GAGGAGGAGG AGCATGTCGG 240 

ACGGTTTCGA TCGGGCCCCA GGTGCTGGTC GGGGCCGGAR CCGGGGCCTG GGCCGCGGAG 300

GGGGCGGGCC TRAGGGCGGC GGTTTYCCGA AMGGARCGGR GCCTGCTGAG CGGRCGCGGC 360 

ACCAGCCGCC GCAACCCAAA GCCCCGGGCT TYCTGCARCC AMCGCCGCTG CGCCARCCCA 420 

GGACGACCCC GCCGCCAGGG GCCCAGTCCG AGGTCCCCGC CAGCCCCCAG CGGCCTTCCC 480

GGCCCGGGGC GCTCCCAGAG CAAACGAGGC CCCTGAGAGC TCCACCTAGT TCACAGGATA 540

AAATCCCACA GCAGAACTCG GAGTCAGCAA TGGCTAAGCC CCAGGTGGTT GTAGCTCCTG 600

TATTAATGTC TAAGCTGTCT GTGAATGCCC CTGAATTTTA CCCTTCAGGT TATTCTTCCA 660

GTTACACAGA ATCCTATGAG GATGGTTGTG AGGATTATCC TACTCTATCA GAATATGTTC 720

AGGATTTTTT GAATCATCTT ACAGAGCAGC CTGGCAGTTT TGAAACTGAA ATTGAACAGT 780

TTGCAGAGAC CCTGAATGGT TGTGTTACAA CAGATGATGC TTTGCAAGAA CTTGTGGAAC 840

TCATCTATCA ACAGGCCACA TCTATCCCAA ATTTCTCTTA TATGGGAGCT CGCCTGTGTA 900

ATTACCTGTC CCATCATCTG ACAATTAGCC CACAGAGTGG CAACTTCCGC CAATTGCTAC 960

TTCAAAGATG TCGGACTGAA TATGAAGTTA AAGATCAAGC TGCAAAAGGG GATGAAGTTA 1020

CTCGAAAACG ATTTCATGCA TTTGTACTCT TTCTGGGAGA ACTTTATCTT AACCTGGAGA 1080

TCAAGGGAAC AAATGGACAG GTTACAAGAG CAGATATTCT TCAGGTTGGT CTTCGAGAAT 1140

TGCTGAATGC CCTGTTTTCT AATCCTATGG ATGACAATTT AATTTGTGCA GTAAAATTGT 1200

TAAAGTTGAC AGGATCAGTT TTGGAAGATG CTTGGAAGGA AAAAGGAAAG ATGGATATGG 1260

AAGAAATTAT TCAGAGAATT GAAAACGTTG TCCTAGATGC AAACTGCAGT AGAGATGTAA 1320

AACAGATGCT CTTGAAGCTT GTAGAACTCC GGTCAAGTAA CTGGGGCAGA GTCCATGCAA 1380

CTTCAACATA TAGAGAAGCA ACACCAGAAA ATGATCCTAA CTACTTTATG AATGAACCAA 1440

CATTTTATAC ATCTGATGGT GTTCCTTTCA CTGCAGCTGA TCCAGATTAC CAAGAGAAAT 1500

ACCAAGAATT ACTTGAAAGA GAGGACTTTT TTCCAGATTA TGAAGAAAAT GGAACAGATT 1560

TATCCGQGGC TGGTGATCCA TACTTGGATG ATATTGATGA TGAGATGGAC CCAGAGATAG 1620

AAGAAGCTTA TGAAAAGTTT TGTTTGGAAT CAGAGCGTAA GCGAAAACAG TAAAGTTAAA 1680

TTTCAGCATA TCAGTTTTAT AAAGCAGTTT AGGTATGGTG ATTTAGCAGA ACACAAGAGA 1740

GCAAGAAAAT GTGTCACATC TATACCAAAT TRAGGATGTT GAGTTATGTT ACTAATGTAT 1800

GCAACTTTAA TTTTGTTTAA CACTATCTGC CAAAATAAAC TTTATTCCCT ATAACTTAAA 1860

ATGTGTATAT ATATATAATA GTTTATTATG TACAGTTAAT TCTACTGTTT TGGCTGCAAT 1920

AAAATCGATT TTGAAATAAA TGAAATGTTG AAAATTTTGC TAGTTGGTTA GATGCTTATC 1980

CTTTAAATTC TACTTTTCTT GAGGGGAAAA AGTCTTCGTC TGGAAATACA TATTACTGCA 2040

AAAATGTAGC ATCCTTTTTT AGGTAGGAGT ATTATAGCTT YCATTTTAGT TKGACATTTA 2100

GTGTCCCAAT GAATTGAATT TCAAATATGA ATCATAATCT TGAAAATCTT TAGCACTAAA 2160

GTCTTGGGAA TATATCAACA ACTGATTTAC ATATGCAGAT GCTATTTGMA TACCAAGGGC 2220

TTTTTAAATG TCATGGGGGG GAAAAACCCA ACTTGGTGGA ACTCCCAGCT AAACAACCAA 2280

GACTTCACTG GAAGATTTAT TCCAATTCTA GGAATTGTTC TTTTTTATTT TTATTTTTTC 2340

AACTGRCTAA CTTCATTACC TTAAAGCCTA GAACATTATT CTGCTTTATT TATATGGCTT 2400

TCTCACTTTT ATTTTGTAGC AKGGGTTGCA TCGACTTTTT TACTAGAGAA TTTTACTAGA 2460

TATTTGTCAT TCAAGTTTTC ATCTGCTTTA TAATTGATAC ACCTTGAGGG TCACTTTTCT 2520

AATACTTTTA CTATAATGTG GTACCACCTC AGCCCTAATA AATAATATTT TTACCTAATG 2580

TCAAATCTTT TTCCAGCTAA CTAAAAACTG TGTACAAAAG GATTGCTTGT AAATATGCAT 2640

GTAAATAGTT CTGTTAATAA CCCACTCTTT TACATTTGGT ACATCTGTGT CTGCTAATAC 2700

AGTTAGCTTT CTCACTTTTC TGCTTGTTTG TTCAGTCTGA ATTAAAATTA GACTTTGAAA 2760

ATAAAGCTTA AAAAAAAAAA AAAAAAAAAA AAAAACTCGA G 2801



(2) INFORMATION FOR SEQ ID NO: 97: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1631 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

ATGGAGCCAA AGACAATCAC TGATGCTTTG GCTTCTAGTA TAATTAAGAG TGTGCTGCCT 60 

AATTTTCTTC CATACAATGT CATGCTCTAC AGTGATGCTC CAGTGAGTGA ACTGTCCCTC 120 

GAGCTGCTTC TGCTTCAGGT TGTCTTGCCA GCATTACTCG AACAGGGACA CACGAGGCAG 180 

TGGCTGAAGG GGCTGGTGCG AGCGTGGACT GTGACCGCCG GATACTTGCT GGATCTTCAT 240

TCTTATTTAT TGGGAGACCA GGAAGAAAAT GAAAACAGTG CAAATCAACA AGTTAACAAT 300 

AATCAGCATG CTCGAAATAA CAACGCTATT CCTGTGGTGG GAGAAGGCCT TCATGCAGCC 360 

CACCAAGCCA TACTCCAGCA GGGAGGGCCT GfTTGGYTTTC AGCYTTACCG CCGACCTTTA 420 

AATTTTCCAC TCAGGATATT TCTGTTGATT GTCTTCATGT GTATAACATT ACTGATTGCC 480 

AGCCTCATCT GCCTTACTTT ACCAGTATTT GCTGGCCGTT GGTTAATGTC GTTTTGGACG 540

GGGACTGCCA AAATCCATGA GCTCTACACA GCTGCTTGTG GTCTCTATGT TTGCTGGCTA 600 

ACCATAAGGG CTGTGACGGT GATGGTGGCA TGGATGCCTC AGGGACGCAG AGTGATCTTC 660

CAGAAGGTTA AAGAGTGGTC TCTCATGATC ATGAAGACTT TGATAGTTGC GGTGCTGTTG 720 

GCTGGAGTTG TCCCTCTCCT TCTGGGGCTC CTGTTTGAGC TGGTCATTGT GGCTCCCCTG 780 

AGGGTTCCCT TGGATCAGAC TCCTCTTTTT TATCCATGGC AGGACTGGGC ACTTGGAGTC 840

CTGCATGCCA AAATCATTGC AGCTATAACA TTGATGGGTC CTCAGTGGTG GTTGAAAACT 900 

GTAATTGAAC AGGTTTACGC AAATGGCATC CGGAACATTG ACCTTCACTA TATTGTTCGT 960 

AAACTGGCAG CTCCCGTGAT CTCTGTGCTG TTGCTTTCCC TGTGTGTACC TTATGTCATA 1020 

GCTTCTGGTG TTGTTCCTTT ACTAGGTGTT ACTGCGGAAA TGCAAAACTT AGTCCATCGG 1080 

CGGATTTATC CATTTTTACT GATGGTCGTG GTATTGATGG CAATTTTGTC CTTCCAAGTC 1140 

CGCCAGTTTA AGCGCCTTTA TGAACATATT AAAAATGACA AGTACCTTGT GGGTCAACGA 1200 

CTCGTGAACT ACGAACGGAA ATCTGGCAAA CAAGGCTCAT CTCCACCACC TCCACAGTCA 1260 

TCCCAAGAAT AAAGTAGTTG TCTCAACAAC TTGACCTTCC CCTTTACATG TCCTTTTTTG 1320 

TGGACTTCTC TCTTTGGAGA TTTTTCCCAG TGATCTCTCA GCGTTGTTTT TAAGTTAAAT 1380 

GTATTTGACT TGTGTTCTCA GCATTCAGAG AGCAGCGGTG TAAGATTCTG CTGTTCTCCC 1440 

TGGATCrTCT GACATTACTG CTGTCTGAGA TTTGTATATG TGTAAATACA AGTTCCITGA 1500 

TACCCTAAAA CCTTGGATTA AACAGAATGT GCATTGTACA TCTTTAAACA AAATGTATAT 1560 

TAATTTATTA AATCTAGTTG TCACTTTAAA AAAAAAAAAA AAAAAACTCG AGGGGGGCCC 1620 

GGTACCCAAA T 1631



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 504 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

CCGAGCTGGG CGAGAAGTAG GGGAGGGCAC GAGCCGCCGC OGTGGCGGTT GCTATCGCTT 60 

CGCAGAACCT ACTCAGGCAG CCAGCTGAGA AGAGTTGAGG GAAAGTGCTG CTGCTGGGTC 120 

TGCAGACGCG ATGGATAACG TGCAGCCGAA AATAAAACAT CGCCCCTTCT GOTTCAGTGT 180

GAAAGGCCAC GTGAAGATGC TGCGGCTGGA TATTATCAAC TCACTGGTAA CAACAGTATT 240 

CATGCTCATC GTATCTGTGT TGGCACTGAT ACCAGAAACC ACAACATTGA CAGTTGGTGG 300 

AGGGGTGTTT GCACTTGTGA CAGCAGTATG CTGTCTTGCC GACGGGGCCC TTATTTACCG 360 

GAAGCTTCTG TTCAATCCCA GCGGTCCTTA CCAGAAAAAG CCTGTGCATG AAAAAAAAGA 420 

AGTTTTGTAA TTTTATATTA CTTTTTAGTT TGATACTAAG TATTAAACAT ATTTCTGTAT 480 

TCTTCCAAAA AAAAAAAAAA AAAA 504

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

GGCACGAGGG AGGGAGCCCT CTCCGTTGGG TGACTCTTGT GTGCCCTTTA GACAGGCTGG 60 

CCTGCCGGTT CCACAGGGTA CAGTTAGGAC TTGAGTCTTT CTTTTTCTGT TTTGAGTTGG 120 

TGAGTGAGTG ATAGGGTAAC ATGGGCCTTC AGGATGACCC CTTGGAACTG TGCCGAGTTC 180 

CTTAAATCTC AGCTGGGATC CTGGACCTGG GAGGCCCCTG TGAGGGCCAG CTCTGGAAAA 240 

ACCTGGGAGT TGATGCCGGA GCTGTGGAAG AACTCTGCTC GAGGGCAGGG TGCCCTGGAA 300 

CACTGGTAGT TCTGGGGCTG GGAGGGAGAG GGGCTCCGGC TTTCTCTGAA ATGAACACTG 360 

CTCTTCAGCA GTTCAAGTAC TTGTTCTCAA AACATTTTCT AATTGATTGG TAGGTTTTCA 420 

TAAGCATTCT TTCTTTAAGG CATGGAAAGG GAAGAATGCT CAAGCAAGTC ATGTTTGTTT 480 

TCAGTGGGAT GGGCCCGCGT TCTCACTGCT GGGGGCTTCC CCTTCATGTG GCACCTTTGT 540 

GCAGGGGCCA CCAGGCAGAC TCTTCCCACC TTCTCCCACT GAAGCACCAA GGGGCTTGGA 600 

ACCGTAATTT GGCTAATCAG AGGCATTTTT TTTGTCCTAG TATCTTTCAC ACTTGTCCAA 660 

CCGTCTTATT TTTTTAAAAG TTCTGTTCCT TGTATTAACA CGAAACTAGA GAGAAATAGT 720 

TTCTGAAGCC AGTTTATTGT GAAGATCCCC AAGGGGAGGT TCGGTAGAGA AAAATAGTAA 780

GCTGGTTTAG AAACTGACGA GGGCAAACAG CCAGGACGCA 1TGGAGAGGA ATTTGCCAAA 840 

GATCTACCCT GAGATAACGC CTGTCCAGTG TCTTCACCAC GK5AATAACC AGCGCTCCAA 900 

AGTGTTTTTC TGCTTTGAAA AAAAAAATTC CACAAGCTTT TAAAGGTGCA TTTAAGAATC 960 

CATGTGACTT TAGAATGGAA CTGCCGGCCC TGGCAACTGT CACGTGTGCT AGAAGGTTCG 1020 

ATGCCTCTGG AATGCATGTG ATACTCATCT CCMTTTCTT TCCTTGATO3 CATTTTrGTr 1080 

CTTTTAGCAG ATCTGTCCCT GTGGGTGGTG TCTAAGAAGT CGGACACCTT GGTTTTCCTG 1140 

TTAGATTGAG CTGGGCAGCT GCAATCAGCT TCTTTATATG CAAATTAGGC ACGACCCATC 1200 

TC?TGGTTCCT GGTTGGTGGC TAATGAAGTG AGGGGAGGGA GGGATGTCAC CCCAAAAGTA 1260 

GGCCCTCCCA TTGGCTTTGG CCAGGCCAGA CACTTCACAT CGTTTACATG GTTCTGTGTA 1320 

ATTTTAAAGT TTATGTGTAT AAAGCGAAGC TGTTTCTGTG AAACTGTATA TTTTGTAAAT 1380 

AAATATATTG CTACTTGAAA AAAAAAAAAA AAAAAA 1416

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2847 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

GGCTAGGACA ATTTTGGTGC TTTACCTATC TCTGCAAAGA CTGGAGAATT TGGCATACCA 60

TTAATTACAA CCACCAATCA TATCCAACAA AAGTACCCTA AAAGAAGGAC CAGTGGCCAC 120 

TCTCGAAAAA ATTTAAGTAT CAGAAGATTA AAAAGATTTT AGGATTTGGA AGCTTGTATT 180 

GTCTTTCCCC AATAATCA1T G1*1TGATCTC CAAATAGTAG CCTTATATTA GCAATRGACA 240 

GATCATTGGT TCTCCATATC TGATCATATG TTACTACTTT GGAATCAGTA TTTGGGCAAA 300 

TTCAAGCATT TATGCAGTGG ATATAAATGG AAATATAAAA ATATTTGCCA ACCTGTCTCA 360

GTAACTTATC ATATCTCTGT GNATCCTCAA GGAAAGCACT TTTGCTTTTA CTTAGAAAGC 420 

GTTTCAGATT TGCTTTATAG ACTCCTGCTG TCTTCAGTAC CTGATAAAAC TTTAACCAGG 480 

GAAGCATTAA ACACAGTGCA GCAGCTTTTG CCCAGGCTTC TAAGTTCCTG CCGGCAGCAT 540 

TTATCAATGT AAGAACTAGG ATGCTTCCTG CAGTGGCACT ACOTCCCCT AGAGCTGGAG 600 

CATGCTGCTT GGCCTTAAGC CCCAGCATGA TGAGGCTTCC CTCCTGCCAG GTCAGTAAAA 660

GTTAGAGAGC TCAGAATTGG GTCTTGCCTG GGTGCAGGTG GCAGGGTTTG CTGAAACCCC 720 

TAAAGAGAAG TCACCAAGGG AGGCAGGTAA TGAATGTTTC CAGAATCAGT CKGATACTCA 780 

TAGCAATTTC TGGCTATCTT TCAAATGTTG AATTTCTGGA TGCTGAGAGG GACTTT3ATT 840 

TGATATCATT AAATCCAGGA CAGTCCCAAG AAGTGCTTGG AGTCTCGGCT CTGACAGCCC 900 

AAGAAGGGAA ATAACTTGTA TTAAGGAACA ACTATGAGCC AGGCCCTGAG CTGTCTCTTA 960

GATAATAAAA CAGATGGGGA GTGGAAGAGT CATTTGCTTC AAGTTATACA GCTAGGAAAT 1020 

ACTCAAGCCA AATCTTGAAC GCAGCTCCCC CTAATTCTGT GGACAGGCAC TTTGTACCAC 1080 

ACACCATGGT CCACCTAAAA ACAGAAGGAT AAAAAGACTT CAGGTTTTCC CACTGTGTGC 1140 

TGACCATCCC AATTTATGAA TCTTCTTCAA AATGACATTT CACAGTTATA GTTAGGGCTC 1200 

AGAAATGGCA TTGAGGTAGC CTTATTTCTC CCCTTTAGCA GATGCTTTAA GTACACATTG 1260

CTGACTTGAG CCCACCCCCA GGAGTTAGGA GAACATTTCC TTTTTCATGC CATCTTCCAT 1320 

AAATAAGGTG TTTCTTGGCC TTCAAAGATA TAGAACTTTG CAGCAGTAGT AAAAGTGAAG 1380 

GGTGTTCTGC TCTCTACTCA ACTTTATTTG AAAATGTCTG CAGCTTCACT CCTGTAGAAA 1440

AGGAAATCTT CATATTTTAG TAAACTTAGC CGCCAGTGTA CTCTGTGAGG ATGTGGCAAT 1500

TCAAAGTCCA GTGAATCTGG CTCTCTTACT GATTCCTGGT TTTAGTGTGT GTGTCGGGGG 1560

AGTGTGTACC TATATATAAA GGACAAGTGT GATATGTGTG TATATGTATA TACATACATA 1620

CATGTCCACA CACACACACA CAATATTTGA GAGCTAAGGA AAACTCAAAG CAGCCCCTTC 1680

ATTATCTTGC GTACTACTTC AAAGATTTCT GTCAGCCCTA ATTACAAGTG TCACCATATA 1740

GTTGGGGCTT AGGTACTTGC TTACAGGAAG AGCAATTCCC TAGCAAAGGT CATTAGCTCC 1800

TAAGGCACTG AGTCAAAGTG ACAGCCCTGA AGGAAATTGC ACTCCAGCCC TCCTCCAGGA 1860

TGTCTAATAA GATGGGAAAC TTGGATGCCC AGCCATTTTG GTGACCTGAG AGTCTAACTA 1920

CTCCAGTTAG ACCTAAGGGC ACAAATGCAG AATTCATGAC CTTGTAGTTG TGGCAGGGTC 1980

TAGGAAGTCC TCTCTCCCCA AGTAGAAAAT ATTCTCTTGC CATTCCTGAA ATTCCACATT 2040

CATATAATGG CTGTGCAATA CATGCTTCTC AATAAGAAAA TTAACTGCAT GTTTACTGTG 2100

TGCTGATCAC ATCAGATTTT TATGTTTAAA AAAATCTCAT TATGGNTTGA GTCCAGCCCA 2160

GCTCTAAGAG AAAAAGAAGG CCCATATGGG AGACTTCAGT CTCATTATTA TTGCCTTTAT 2220

CCAGCAGTGC TTATRAAGCC CCCTACCCTG TCCCATTCCA GAAACCATAA GACTCAGGCA 2280

GTTCTTGATT CTGGAGGCCT GCCTGGTAAG ATAAGATAGT ATAATTTGGA ACTGAGAACA 2340

TACCAGAAAC AGCAGAACGA GGGCCAGAGC AGAAAAATGA AAATAAGTGG AGACACTTAT 2400

GGATACATTG GTGCAAAAAA AGCCACGGGS CCCATACTGG GCTTGATATG ACTTTGAGGG 2460

GACAGCAGAT TAATACTTAA TGAGGGTTAA ACCTGACCAG TCTTTCTACA GTGACAGGCC 2520

ACACTGCATG AATGGGGAGA ACCAATGAAT CCATTGTCCT CTGCCTATTT TCCTGTGCAC 2580

AGTCACATTC CCTCCTTAGG AATCTTCCCC TTCCACCCTT TACATTAAAC AAGGGAACAC 2640

TGAATCTTTC AAGGGAATTA CACGTTTGGG TTAATGTTTC AGTATATCAT TTTCATACTG 2700

TAAATTATTT TGTAAGAGAG ATTTACTGCT ATCCCAGGAT GTTCGGACTT GGTGCCCCTG 2760

TGCATTTGGA AATGAATAAA CTATTACTGG AAATGCCAAA AAAAAAAAAA AAAAAAAAAN 2820

NAAAAAACTC GAGGGGGGCC CGTACCC 2847



(2) INFORMATION FOR SEQ ID NO: 101:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1394 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

GAGATTGGTG GAGGAGAGTA AATAATCTAG AGGCAAGAGT TCAGTGAGGG CCAAGGGGGA 60

CCCCCAGAAA AAGGTATGGA GCTAACTCAT CTCTTTTACA AGGGGTGGCC ATGACTTACT 120 

GTTGCAAAGT ACTCAGTGTA TATTTAATGT TGATTGTTGA ATTTTAGTTA CGAGAGGGAA 180 

GAACAATTTT ACTTCTGTCC TTATTTCACT TGCTGAAAAG CTGTGGGACA AAATGTATGG 240

AATAGACAAG GCCACTTTCT TTGTGATTTC TGCTTTTCAT GCATATTATT TTATTTACCC 300 

ATAATTTCCA AGAGGTTTGG CGTTCCGCTC TCCTGCTTTT TTCTTTCATC CACCCCTTTC 360

CTITTTTTGG AAGGGGGTTA TATATGAGAG TTCATTGAAG AAGTCCAGTG AGGCTGAAGT 420 

AAAGGGGCAA GATAGGGCAG TTAACTAAAG AGCACTTTAT TTCTTTGAAG CCTTTCTAAG 480 
AAAGAAATGG GGGTGCGAGT GGCTTGAATC TCCCATGATG TTGGAGGGCA CTTAGTGGGG 540

TTGAAGTATG ACATAATATT TCCCATTGGG GAAAGGAGAA TTTCTCTTAG AGGGTGGCAA 600

AATGCCTTTG CCCAGTGTCC CTATTTTAGG CATCTTTTCC TTCCTTATTC CTTCCAGTCA 660

GGGTGTGTCC TATACAAAAC TTCCCATCAG TTCTCCTCAA TATTCCCCAT TTGTAAATGA 720

TCACTTCTCT TTTCTAAACC CTTTTCCTGT TCAGATCCAT ACAGGATTTG CAAGGGTAGG 780

ATCATACATG CAAATGCCCC TTGTTCATCT GTGTCTTCTG CAAACTAGTC TCATGAAGAA 840



TTCTGGCGTG CAGCAGGGTA GCTGAAGTTT GGGTCTGGGA CTGGAGATTG GCCATTAGGC 900 

NTCNCTGAGA TTCCAGCTCC CTTCCACCAA GCCCAGTCTT GCTACGTCGC ACAGGGCAAA 960 

CCTGACTCCC TTTGGGCCTC AGTTTCCCCT CCCCTTCATG AAATGAAAAG AATACTACTT 1020 

TTTCTTGTTG GTCTAGCATT GCTGGACACA AAGTGTAGTC ATTATTGTTG TATTGGGTGA 1080 

TGTGTGCAAA ACTGCAGAAG CTCACTGCCT ATAAGAGGAA ATAAGAGAGA AAGTGGAGGA 1140

GAGGGACAAA AGGAGTAATT ATTTGGTATA GATCCACCCA TCCCAACCTT TCTCTCCTCA 1200 

GTCCCTGCTC CTCATGTTTC TGGTTTGGTG AGTCCTTTGT GCCACCACCC ATAATGCTTT 1260 

GCATTGCTGC ATCCTGGGAA GGGGGTATAT GGTCTCACAA GTTGTTGTCA TTGTTTTTTT 1320 

GCATGCTTTC TTAATAAAAA AAAAAAAAAA ATGTTTANAG TTTTATCTTA AAAAAAAAAA 1380 

AAAAAAAAAA ACCC 1394



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 794 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

GGMRCGAGGC GGAGTAAAGG GACTTGAGCG AGCCAGTTGC CGGATTATTC TATTTCCCCT 60

CCCTCTCTCC CGCCCCGTAT CTCTTTTCAC CCTTCTCCCA CCCTCGCTCG CGTACCATGG 120

CGGAGCGTCG GCGGCCACTC AGTCCCATTC CATCTCCTCG TCGTCCTTCG GAGCCGAGCC 180

GTCCGCGCCC GGCGGCGGCG GGAGCCCAGG AGCCTGCCCC GCCCTGGGGA CGAAGAGCTG 240

CAGCTCCTCC TGTGCGGTGC ACGATCTGAT TTTCTGGAGA GATGTGAAGA AGACTGGGTT 300

TGTCTTTGGA CACGCTGATC ATGCTGCTTT CCCTGGCAGC TTTCAG7GTC ATCARTGTGG 360

GTTTCTTAMC TCATCCTGGC TCTTCTCTCT GTCACCATCA RCTTCAGGAT CTACAAGTCC 420

GTCATCCAAG CTGTOCAGAA RTCAGAARAA GGCCATCCAW TCCAAAGCCT ACCTGGACGT 480

AGACATTACT CTGTCCTCAG AAGCTTTCCA TAATTACATG AATGCTGCCA TGGTGCACAT 540

CAACAGGGCC CTGAAACTCA TTATTCGTCT CTTTCTGGTA GAAGATCTGG TTGACTCCTT 600

GAAGCTGGCT GTCTTCATGT GGCTGATGAC CTATGTTGGT GCTGTTTTTA ACGGAATCAC 660

CCTTCTAATT CTTGCTGAAC TGCTCATTTT CAGTGTCCCG ATTGTCTATG AGAAGTACAA 720

GACCCAGATT GATCACTATG TTGGCATCGC CCGAGATCAG ACCAAGTCAA TTGTTGAAAA 780

GATCCCAAGC AAAA 794

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1544 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

TTTGCTTGCT AGTCTGAACC AAAGAGTTGT TTGGGCATTT GCTGTGTTGG CCATTTCTGG 60 

AGCAAGAGGG TCTTCTTCCT CCTTCCCCCA GCCAGCCAGC TGTCCTGGGG CCAGGCTTTC 120 

CTGGGTGGAA AGAAGTATAC CTTTCCCTGG GGCCCTAGGA TAGCAAAGTG AGCCATAGTG 180

GGCCAGGCTG CCCTCCATGC TGGGCCCCAG CCCAGGTCTG CACTCGCCTG GATCACCTTC 240 

TTTGAGCCTT AGCCATCTCC TGTCAGGTAG GAATGAACTT GCCAGCCTTC AGGYTCGTTC 300 

AGCTATGACC ATCTGTGCGG TCAGGGTACA CTCAGCTCTC CTCCCCAACT CCAGCAGCCT 360 

TTAAGAAGTG TCCCTTTGGC GCCCCCTQGA GGCAGAGCAC TGAGCTGGAC CCTGGGTAGA 420 

CTCCCACAGG GAGGACGGAG CTGGCCTCAG GAGTGGGACA CCCAGACTTG GCAGGGCCTT 480

CAAGAGGCCT GTGTGGGGGC CCCAGGAATC CTTAGCTGAA GCGGGGAGAC TCACTCTCCA 540 

TCTCAGGAAA TTCTAGCCCT TGCCCTCAGG GAGCCACGGT TGAGGGTGAG GCCCAACACC 600 

TGCCTTAGGG CCCTGGGTGG GCAAGTCTGG GCCCTGGGGT AGGGAGGGAG ACTCAGGCCC 660 

ACACTTGGGT ATTTTCTAAT TTCAGACAAA CACACACTCA GCGCGCACTC ACTGATTCCT 720 

ACACATTGCC AAGATTTCAC ACATGTGACC AGGGGCCACC AAAGTCCCTG TGACCTTTGT 780 

GACTAGGATC CTAATTTCTC TATTTTCTCC TGGGTGCCTG GGTCTGTGTC ACCTGGGGCA 840 

GTGTGGATAA TGTTTAGTTC TGTGACACTG TTTTTTGGGG GTGGCACCTG GTTCTCCGAT 900 

GCCTGGGCTG GTGTCAGGCC CAGGACTGTA GTGCTGGGAG CAGTAAAGCT CAGCTCTGTG 960 

TAATGAGTGA TGCTATGGCT TGCTCGTGTC TTATGATCCA ATCCTTTTCT ACATCAGCCC 1020 

TTGTTTTGTT TTATGGCTAG TCTTATCTGG CCTGGTTATT TCCTTGCGGG GAGGAGAGGG 1080

TTTGCTAATC TGCTCCCAGC CCAACCTATT ACCACCCCAC CTCGCTGGGA CCTACTGCTC 1140 

GGGAGGCAGC AGACAGGGAG CCACCAGCAG TGGCTTCCTG GCCCTGTGCT GGGGGTGGGG 1200 

GGAAGCTGGG GGCACATGTG GCCCTTGCCT TCTGAGCAGC TCCCAGTGCC AGGGCTTTGA 1260 

GACTTTCCCA CATGATAAAA GAAAAGGGAG GTACAGAAGT TCCAATTCCC TTTTTATTTT 1320 

GCTGGTTGGT ATCTGTAAAT GTTTAATAAA TATCTGAGCA TGTATCTATC AACGCCAAGA 1380

ATTTCAAAGT CTCCTTCAAC AATATGAGGC TTTTAGGATG TTTATATTCC TTCATCCCTC 1440 

TTGTTTCCCA GGTTITGCAG GGAAAAAAAG TCTGGAATTA TAGATACAGC TTATTATTAA 1500 

ATTTGTTCTT GCATAAAAAA AAAAAAAAAA AACNCNNGGG GGGG 1544 



(2) INFORMATION FOR SEQ ID NO: 104:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 871 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

ACCCACGCGT CCGNCTTCTC CACCCGGGGG CGTGGGAGTG AGGTACCAGA TTCAGCCCAT 60 

TTGGCCCCGA CGCCTCTGTT CTCGGAATCC GGGTGCTGCG GATTGAGGTC CCGGTTCCTA 120 

AGGTGGGTCG CTGTCCACCC GGGGGCGTGG GAGTGAGGTA CCAGATTCAG CCCATTTGGC 180

CCCGACGCCT CTGTTCTCGG AATCCGGGTG CTGCGGATTG AGGTCCCGGT TCCTAACGGA 240 

CTGCAAGATG GAGGAAGGCG GGAACCTAGG AGGCCTGATT AAGATGGTCC ATCTACTGGT 300 

CTTGTCAGGT GCCTGGGGCA TGCAAATGTG GGTGACCTTC GTCTCAGGCT 1TCCTGCTTT 360

TCCGAAGCCT TCCCCGACAT ACCTTCGGAC TAGTGCAGAG CAAACTCTTC CCCTTCTACT 420

TCCACATCTC CATGGGCTGT GCCTTCATCA ACCTCTGCAT CTTGGCTTCA CAGCATGCTT 480

GGGCTCAGCT CACATTCTGG GAGGCCAGCC AGCTTTACCT GCTGTTCCTG AGCCTTACGC 540

TGGCCACTGT CAACGCCCGC TGGCTGGAAC CCCGCACCAC AGCTGCCATG TGGGCCCTGC 600

AAACCGTGGG AGAAGGAGCG AGGCCTGGGT GGGGAGGTAC CAGGCAGCCA ACAGGTTCCC 660

GATCCTTAAC GCCAGNTGCG AGAGAAGGAC CCCAAGTACA GTGCTCTCCG CCAGAATTTC 720

TTCCGCTACC ATGGGCTGTC CTCTCTTTGC AATCTGGGCT GCGTCCTGAG CAATGGGCTC 780

TGTCTCGCTG GCCTTGCCCT GGAAATAAGG AGCCTCTAGC ATGGGCCCTG CATGCTAATA 840

AATGCTTCTT CAGAAAAAAA AAAAAAAAAA A 871



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 404 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

GGCACGAGTT ATAGCATGGC ATTCATACTT TTGTTTTATT GCCTCATGAC TTTTTTGAGT 60 

TTAGAACAAA ACAGTGCAAC CGTAGAGCCT TCTTCCCATG AAATTTTGCA TCTGCTCCAA 120 

AACTGCTTTG AGTTACTCAG AACTTCAACC TCCCAATGCA CTGAAGGCAT TCCTTGTCAA 180 

AGATACCAGA ATGGGTTACA CATTTAACCT GGCAAACATT GAAGAACTCT TAATGTTTTC 240

TTTTTAATAA GAATGACGCC CCACTTTGGG GACTAAAATT GTGCTATTGC CGAGAAGCAG 300

TCTAAAATTT ATTTTTTTAA AAAGAGAAAC TGCCCCATTA TTTTGGTGGG GTTGGTTTTT 360

AATTTNTAAT NTGAAAAATT TTTTTGGGGT TTTTGGGGCC ATGG 404



(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1542 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



GTCAGACAGG TGGAGCCGCC GGGGCAGGAG TCTCAAAGAG CCAGGCTCCA GGAGAGGAAG 60

GGCTCTRCGA GAGGAGAGAG GAGAGCGCTG GAGAGGAGAG GCTGGAGAGT CCTTAGCCAG 120

GATGGAGGCT GTTGTGAACT TGTACCAAGA GGTGATGAAG CACGCAGATC CCCGGATCCA 180

GGGCTACCCT CTGATGGGGT CCCCCTTGCT AATGACCTCC ATTCTCCTGA CCTACGTGTA 240

CTTCGTTCTC TCACTTCGGC CTCGCATCAT GGCTAATCGG AAGCCCTTCC AGCTCCGTGG 300

CTTCATGATT GTCTACAACT TCTCACTGGT GGCACTCTCC CTCTACATTG TCTATGAGTT 360 

CCTGATGTCG GGCTGGCTGA GCACCTATAC CTGGCGCTGT GACCCTGTGG ACTATTCCAA 420 

CAGCCCTGAG GCACTTAGGA TGGTTCGGGT GGCCTGGCTC TTCCTCTTCT CCAAGTTCAT 480

TGAGCPGATG GACACAGTGA TCTTTATTCT CCGAAAGAAA GACGGGCAGG TGACCTTCCT 540 

ACATGTCTTC CATCACTCTG TGCTTCCCTG GAGCTGGTGG TGGGGGGTAA AGATTGOCCC 600 

GGGAGGAATG GGCTCTTTCC ATGCCATGAT AAACTCTTCC GTGCATGTCA TAATGT AC CP 660 

GTACTACGGA TTATCTGCCT TTGGCCCTGT GGCACAACCC TACCTTTGGT GGAAAAAGCA 720 

CATGACAGCC ATTCAGCTGA TCCAGTTTGT CCTGGTCTCA CTGCACATCT CCCAGTACTA 780



CTTTATGTCC AGCTGTAACT ACCAGTACCC AGTCATTATT CACCTCATCT GGATGTATGG 840 

CACCATCTTC TTCATGCTGT TCTCCAACTT CTGGTATCAC TCTTATACCA AGGGCAAGCG 900 

GCTGCCCCGT GCACTTCAGC AAAATGGAGC TCCAGGTATT GCCAAGGTCA AGGCCAACTG 960 

AGAAGCATGG CCTAGATAGG CGCCCACCTA AGTGCCTCAG GACTGCACCT TAGGGCAGTG 1020 

TCCGTCAGTG CCCTCTCCAC CTACACCTGT GACCAAGGCT TATGTGGTCA GGACTGAGCA 1080

GGGGACTGGC CCTCCCCTCC CCACAGCTGC TCTACAGGGA CCACGGCTTT GGTTCCTCAC 1140 

CCACTTCCCC CGGGCAGCTC CAGGGATGTG GCCTCATTGC TCTCTGCCAC TCCAGAGCTG 1200 

GGGGCTAAAA GGGCTGTACA GTTATTTCCC CCTCCCTGCC TTAAAACTTG GGAGAGGAGC 1260 

ACTCAGGGCT GGCCCCACAA AGGGTCTCGT GGCCTTTTTC CTCACACAGA AGAGGTCAGC 1320 

AATAATGTCA CTGTGGACCC AGTCTCACTC CTCCACCCCA CACACTGAAG CAGTAGCTTC 1380

TGGGCCAAAG GTCAGGGTGG GCGGGGGCCT GGGAATACAG CCTGTGGAGG CTGCTTACTC 1440 

AACTTGTGTC TTAATTAAAA GTGACAGAGG AAACCANAAA AAAAAAAAAA AAAAACTCGA 1500 

GGGGGGCCCG TACCCAAATC GCCGGTATGA TCGTAAACAA TC 1542 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 2327 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

GGTAGCTCAN TGCAGTGAAA TAGTCTTACT GGAAACAAAG CCCTTTATCA AGAATAATTA 60 

ACTCTTCCCT TITCTTTTTG GAGAGGTGCT TTGTTTCTGA TCGGACCATT TCACTGCAGC 120 

AAGCAACACA GTATTCTRAG CAGAAGATCG GGACTTGAGG CCATGTTGCG GAGGGCCAGT 180

RACATTATCT GGACTCTGGA GTGTGAGGAA TATGGACTCC ACTCTTCACT ATATTCACAR 240 

CGATTCAGAC TTGAGCAACA ATAGCAGTTT TAGCCCTGAT GAGGAAAGGA GAACTAAAGT 300 

AGAAGATGTT GTACCTCAGG CGTTGTTAGA TCAGTATTTA TCTATGACTG ACCCTTCTCG 360 

TGCACAGACG GTTGACACTG AAATTGCTAA GCACTGTGCA TATAGCCTCC CTGGTGTGGC 420 

CTTGACACTC GGAAGACAGA ATTGGCACTG CCTGAGAGAG ACGTATGRGA CTYTGGCCTC 480

AGACATGCAG TGGAAAGTTC GACGGAACTC TAGCATTCTC CATCCACGRG CTTGCAGTTA 540 

TTCTTGGAGA TCAATTGACA GCTGCAGATC TGGTTCCAAT TTTTAATGGA TTTTTAAAAG 600 

ACCTCGATGA AGTCAGGATA GGTGTTCTTA AACACTTGCA TGATTTTCTG AAGCTTCTTC 660 

ATATTGACAA AAGAAGAGAA TATCTTTATC AACTTCAGGA GTTTTTGGTG ACAGATAATA 720 

GTAGAAATTG GCGGTTTCGA GCTGAACTGG CTGAACAGCT GATTTTACTT CTAGAGTTAT 780

ATAGTCCCAG AGATGTTTAT GACTATTTAC GTCCCATTGC TCTGAATCTG TGTGCAGACA 840 

AAGTTTCTTC TGTTCGTTGG ATTTCCTACA AGTTGGTCAG CGAGATGGTG AAGAAGCTGC 900 

ACGCGGCAAC ACCACCAACG TTCGGAGTGG ACCTCATCAA TGAGCTTGTG GAGAACTTTG 960 

GCAGATGTCC CAAGTGGTCT GGTCGGCAAG CCTTTGTCTT TGTCTGCCAG ACTGTCATTG 1020 

AGGATGACTG CCTTCCCATG GACCAGTTTG CTGTGCATCT CATGCCGCAT CTGCTAACCT 1080

TAGCAAATGA CAGGGTTCCT AACGTGCGAG TGCTGCTTGC AAAGACATTA AGACAAACTC 1140 

TACTAGAAAA AGACTATTTC TTGGCCTCTG CCAGCTGCCA CCAGGAGGCT GTGGAGCAGA 1200 

CCATCATGGC TCTTCAGATG GACCGTGACA GCGATGTCAA GTATTTTGCA AGCATCCACC 1260 

CTGCCAGTAC CAAAATCTCC GAAGATGCCA TGAGCACAGC GTCCTCAACC TACTAGAAGG 1320 

CTTGAATCTC GGTGTCTTTC CTGCTTCCAT GAGAGCCGAG GTTCAGTGGG CATTCGCCAC 1380

GCATGTGACC TGGGATAGCT TTCGGGGGAG GAGAGACCTT CCTCTCCTGC GGACTTCATT 1440 

GCAGGTGCAA GTTGCCTACA CCCAATACCA GGGATTTCAA GAGTCAAGAG AAAGTACAGT 1500 

AAACACTATT ATCTTATCTT GACTTTAAKG KKWAWKMMWW KCTCAGMSRA TTATAWITSW 1560 

CWMMRARGSM WYMAAWSCTK SWGCTCYWCC KSRSTGPMKG MMRCTCTAGA AYTRGYRGAK 1620 

CMYYYKSGCT KMWGGAAKKS GGCASGAGCC AGAGACCTGC ATTGCTTTCT CCTGGTTTTA 1680

TTTAACAATC GACAAATGAA ATTCTTACAG CCTGAAGGCA GACGTGTGCC CAGATGTGAA 1740

AGAGACCTTC AGTATCAGCC CTAACTCTTC TCTCCCAGGA AGGACTTGCT GGGCTCTGTG 1800 

GCCAGCTGTC CAGCCCAGCC CTGTGTGTGA ATCGTTTGTG ACGTGTGCAA ATGGGAAAQG 1860 

AGGGGTTTTT ACATCTCCTA AAGGACCTGA TGCCAACACA AGTAGGATTG ACTTAAACTC 1920 

TTAAGCGCAG CATATTGCTG TACACATTTA CAGAATGGTT GCTGAGTGTC TGTGTCTGAT 1980

TTTTTCATGC TGGTCATGAC CTGAAGGAAA TTTATTAGAC GTATAATGTA TGTCTGGTGT 2040 

TTTTAACTTG ATCATGATCA GCTCTGAGGT GCAACTTCTT CACATACTGT ACATACCTGT 2100 

GACCACTCTT GGGAGTGCTG CAGTCTTTAA TCATGCTGTT TAAACTGTTG TGGCACAAGT 2160 

TCTCITGTCC AAATAAAATT TATTAATAAG ATCTATAGAG AGAGATATAT ACACTTTTGA 2220 

TTGTTTTCTA GATGTCTACC AATAAATGCA ATTTGTGACC TGTAAAAAAA AAAWAAAAAA 2280

ACTCGAGGGG GGCCCGGTAC CCAAATCGCC GATATGATCT AANCATC 2327 



(2) INFORMATION FOR SEQ ID NO: 108: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1062 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

GGCCGCCGAG GCGCAACAGC CGTTCTGTCA GCTCTGGGTC CAACCGGACT AGCGAANATC 60 

TTCCTCATCC TCATCATCGT CTTCCTCATC CCGATCTCGG TCCAGGTCCC TCTCCCCCCC 120 

ACACAAGAGG TGGCGAAGGT CCAGCTGTAG TTCCTCTGGA CGTTCTCGAA GATGCTCTTC 180 

CTCTTCTTCG TCATCATCTT CCTCTTCGTC TTCCTCATCC TCATCATCCA GTTCTCGAAG 240 

CCGCTCACGA ATCCCCATCC CCCCGCCGGA GRAAGTGACA GGAGGCGGCG GTACAGCTCT 300

TATCGTTCAC ATGACCATTA CCAAAGGCAA AGAGTGCTAC AAAAGGAGCG TGCAATAGAA 360 

GAAAGAAGGG TGGTCTTCAT TGGAAAGATA CCTGGCCGCA TGACTCGATC AGAGCTGAAA 420 

CAGAGGTTCT CCGTTTTTGG AGAGATTGAG GAGTGCACCA TCCACTTCCC TGTCCAAGGG 480 

GACAACTACG GCTTCGTCAC TTATCGCTAT GCTGAGGAGG CATTTGCAGC CATTGAGAGT 540 

GGCCACAAGC TGCGGCAGGC AGATGAGCAG CCCTTTGATC TCTGCTTTGG GGGCCGAAGG 600

SWGTNCTGCA AGAGGAGCTA TTCTGATCTT GACTCCAACC GGGAAGACTT TGACCCAGCA 660 

CCTGTAAAGA GCAAATTTGA TTCTCTTGAC TTTGACACAT TGTTGAAACA GGCCCAGAAG 720 

AACCTCAGGA GGTAACCTTG GGCCCTTCCC TGCTATCCTT TTTCTCCTTT QGAGGTGCCC 780

AACCTCCTCC ACCCCCTTCC CCTACTCTAG GGGAGAGAGC TGCTAGTGAG ATGACTGTTT 840 

TATAAAGAAA TGGAAAAAAG TGAAATAAAA AATATGTTGA ATCAGATTTr TTAAAAGGGG 900 

TATTTGTTTT TTTATAACAG GTATTGAAAC AAGTTAACTT GCATTCCTAT GTAAGATAGG 960 

AGGGGCTGAG GGGATCCCCA GTGTTTGGAA CATAAGTCAC TATGCAGACT AATAAACATC 1020 

AACTAGAGAG NAAAAAAAAA AAAAAAAAAA ATTTAAAAAA CT 1062 



(2) INFORMATION FOR SEQ ID NO: 109: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2539 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

GAGAGACTCA CACTTCTTTT CCATTATCAC TGACGATGTA GTGGACATAG CAGQGGAAGA 60 

GCACCTACCT GTGTTGGTGA GGTTTGTTGA TGAATCTCAT AACCTAAGAG AGGAATTTAT 120 

AGGCTTCCTG CCTTATGAAG CCGATGCAGA AATTTTGGCT GTGAAATTTC ACACTATGAT 180

AACTGAGAAG TGGGGATTAA ATATGGAGTA TTGTCGTGGC CAGGCTTACA TTGWCTCTAG 240 

TGGATTTTCT TCCAAAATGA AAGTTGTTGC TTCTAGACTT TYAAGMKMRA TWKCCCCMAK 300 

YWAWCKGAAC AMAMKCTGSW CYTCCWSYGC SKTRRMKRYC GYKSTATRRC WARWKSAKYM 360 

CCYGKKMTGS RRGTAWYTSK TGCAYKAGGG AACAATTGAG GAAGTTTGTT CTTTTTTCCA 420 

TCGATCACCA CAACTGCTTT TAGAACTTGA CAACGTAATT TCTGTTCTTT TTCAGAACAG 480

TAAAGAAAGG GGTAAAGAAC TGAAGGAAAT CTGCCATTCT CAGTGGACAG GCAGGCATGA 540 

TGCTTTTGAA ATTTTAGTGG AACTCCTGCA AGCACTTGTT TTATGTTTAG ATGGTATAAA 600 

TAGTGACACA AATATTAGAT GGAATAACTA TATAGCTGGC CGAGCATTTG TACTCTGAGT 660 

GCAGTGTCAG ATTTTGATTT CATTGTTACT ATTGTTGTTC TTAAAAATGT CCTATCTTTT 720 

ACAAGAGCCT TTGGGAAAAA CYYCMAGGGG CAAACCTCTG ATGTCTTCTT TGCKKMMSRT 780

APMTTTTGAY ATRMARYACT RMMTKSAYTY AAYGRWGTGA CWSGAWAATA TTRAASTYTA 840 

TACAATKAAT YWTRRYTTSM KRMAGMYAAT CCGAAAYTGT GGMAAMYAAA CTTGATATTC 900 

AAATGAAACT CCCTGGGAAA TTCCGCAGAG CTCACCAGGG TAACTTGGAA TCTCAGCTAA 960 

CCTCTGAGAG TTACTATAAA GAAACCCTAA GTGTCCCAAC AGTGGAGCAC ATTATTCAGG 1020 

AACTTAAAGA TATATTCTCA GAACAGCACC TCAAAGCTCT TAAATGCTTA TCTCTGGTAC 1080

CCTCAGTCAT GGGACAACTC AAATTCAATA CGTCGGAGGA ACACCATGCT GACATGTATA 1140

GAAGTGACTT ACCCAATCCT GACACGCTGT CAGCTGAGCT TCATTGTTGG AGAATCAAAT 1200 

GGAAACACAG GGGGAAAGAT ATAGAGCTTC CGTCCACCAT CTATGAAGCC CTCCACCTGC 1260 

CTGACATCAA GTTTTTTCCT AATGTGTATG CATTGCTGAA GGTCCTGTGT ATTCTTCCTG 1320 

TGATGAAGGT TGAGAATGAG CGGTATGAAA ATGGACGAAA GCGTCTTAAA GCATATTTGA 1380

GGAACACTTT GACAGACCAA AGGTCAAGTA ACTTGGCTTT GCTTAACATA AATTTTGATA 1440 

TAAAACACGA CCTGGATTTA ATGGTGGACA CATATATTAA ACTCTATACR AKTAMGTCAG 1500 

MGCTYYCTAC AKAYRAYTCM SWAWt-TTGTGG AAARYWSSTA MSMSWGCWKK TAMMRKTMCG 1560 

GMWWTYYYMK RKTYGAYMYW YGCGWMCGAG AAAAAGCCGT AAGGTGTATG TAGACCACTT 1620 

AATCACTAAA TATCTTTGCC TATAGGACTC CATTGAATAC ATTAGCCATT GATAATCTAC 1680

CTGTTTAAAT GGCCCCTGTT TGAACTCTCA AGCTTTGAAG ACCTACCTGT TCTTCCAGAA 1740 

GAGAACGTTG AAAGTGCCAT GTTTCCTTTT GCGTGATCTC TGTTGATGGC ACTCTGGAAT 1800 

TGTTTCCAGT TTAAKTCATT TTAGACATAG CATTTATTAT CACTGTGGAT CTCTACTTGT 1860 

TGGGTGTTAT GAATTCTTTG AAGAATATAT TTTGAAGAGG TGTGGGAGGA AGGAATACAT 1920 

TTTATAAAAT GTTGTAGTGA AGCCCACAAT TGACCTTKGA CTAATAGGAG TTTTAAGTAT 1980

GTTAAAAATC TATACTGGAC AGTTACAAGA AATTACCGGA GAAAAGCTTG TGAGCTCACC 2040 

AAACAAGGAT TTCAGTGTAG ATTTTGTCTT TCTTGAACTT AAAGAAACAA ATGACAAAGT 2100 

TTGAATGGAA AAGCCTGCTG TTGTTCCACA TCTCGTTGCT GTTTACATTC CTTTGTGGAG 2160 

CCTACATCTT CCTAAGCTTT TTAGCAGGTA TATGTTGAAC ACTTCTGTTT CATGGTTGAG 2220 

ACAGAATCAG AGGCCATGGA TACTGACAAC TGATTTGTCT GTTTTTTTTC TCTGTCTTTT 2280

TCCATGACTC TTATATACTG CCTCATCTTG ATTTATAAGC AAAACCTGGA AAACCTACAA 2340 

AATAAGTGTT GTGGTTTATC TAGAAAAATA TGGAAAATAT TGCTGTTATT TTTGGTGAAG 2400 

AAAATCAATT TTGTATAGTT TATTTCAATC TAAATAAAAT GTGAATTTTG TTWWATTAAA 2460 

AATTO3GSAC AAABTBGHGG GGGCTCCAAA CHTWVTCGHG KAMfTTCTCT WAARMATYTK 2520 

ATAAACMSCT TCACAATTC 2539



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1751 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

AGCATGAAGC CGATGGCCGT GGTGGCCAGT ACCGTCCTGG GCCTGGTGCA AAACATGCGT 60 

GCGTTTGGCG GGATCCTGGT GGTGGTCTAC TACGTATTTG CCATCATTGG GATCAACTTG 120 

TTTAGAGGCG TCATTGTGGC TCTTCCTGGA AACAGCAGCC TGGCCCCTGC CAATGGCTCG 180 

GCGCCCTGTG GGAGCTTCGA GCAGCTGGAG TACTGGGCCA ACAACTTCGA TGACTTTGCG 240 

GCTGCCCTGG TCACTCTGTG GAACTTGATG GTGGTGAACA ACTGGCAGGT GTTTCTGGAT 300 

GCATATCGGC GCTACTCAGG CCCGTGGTCC AAGATCTATT TTGTATTGTG GTGGCTGGTG 360

TCGTCTGTCA TCTGGGTCAA CCTGTTTCTG GCCCTGATTC TGGAGAACTT CCTTCACAAG 420 

TGGGACCCCC GCAGCCACCT GCAGCCCCTT GCTGGGACCC CAGAGGCCAC CTACCAGATG 480 

ACTGTGGAGC TCCTGTTCAG GGATATTCTG GAGGAGCCCG GGGAGGATGA GCTCACAGAG 540 

AGGCTGAGCC AGCACCCGCA CCTGTGGCTG TGCAGGTGAC GTCCGGGCTG CCATCCCAGC 600 

AGGGGCGGCA GGAGAGAGAG GCTGGCCTAA CACAGGTGCC CATCATGGAA GAGGCGGCCA 660

TGCTGTGGCC AGCCAGGCAG GAAGAGACCT TTCCTCTGAC GGACCACTAA GCTGGGGACA 720 

GGAACCAAGT CCTTTGCGTG TGGCCCAACA ACCATCTACA GAACAGCTGC TGGTGCTTCA 780 

GGGAGGCGCC GTGCCCTCCG CTTTCTTTTA TAGCTGCTTC AGTGAGAATT CCCTCGTCGA 840 

CTCCACAGGG ACCTTTCAGA CAAAAATGCA AGAAGCAGCG GCCTCCCCTG TCCCCTGCAG 900 

CTTCGGTGGT GCCTTTGCTG CCGGCAGCCC TTGGGGACCA CAGGCCTGAC CAGGGCCTGC 960

ACAGGTTAAC CGTGAGTCTG TCTCATCTAT TCACAGCTGG GAATGATACT AATACCTCCG 1020 

ATTTTAGCCC AGCACCACAG GGTACGTTCC AGTTTTTCTC TCTTTCCATA GCTGTAAGGC 1080 

CCTTTCTGGG AATGGTTCTC ATTCTCCTTA ATCTATTATT GGGTCAGTTT TCCTGCATGT 1140 

CCCCAGCCTC CCATCACTGC CACCCACTCC CCACAGAGAT GCCCTGCTCA TCCGACTGGG 1200 

GCTTTGACTC CCACACTGTG TACCCCTCTT GTGTGGACGC CCTGCTGCCA AAACCTTCAG 1260

CAAACAGCTT TCCAAATGGA AGTTGTCACT GTCAGGCCTT TACAATCAGC AACAGCAAAA 1320 

TCTACATGCT GCTGAGGGTC CTGCCTCATT AAGATGCAAT AAATATGTAA GTACATAAAA 1380 

ACAGCAATAG AAGAAACGTA ATGCTTTATT CTCAAATATG ATGTCTACAT AGAAAAGCCA 1440 

AAATTATTAA GAATAGTAAG AATTCACCCA GCACTTTGGG AGGCCGAGGC GGGTGGATCA 1500 

TGAGGTCAGG AGATCGAGAC CATCCTGGCT AACAGGGTGA AACCCCGTCT CTACTAAAAA 1560

TACAAAAAAT TGGCCGGGCG CAGTGGCGGG CGCCTGTGGT CCCAGCTACT GGGGAGGCTG 1620 

AGGCAGGAGA ATGGCGTGAA CCCGGGAAGC GGAGCTTGCA GTGAGCCGAG ATTGCGCCAC 1680 

TGCAGTCCGC AGTCCAGCCT GGGCGACAGA GCGAGACTCC GTCTCAAAAA AAAAAAAAAA 1740

AAAAAAAAAA A 1751

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1117 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

AATGTTGTGG TGGTAGCATT TGGGTTAATT CTRATTATAG AGTCTCTTGG AGAGCAATGT 60 

CCATAAACTA ATCCCAAACA ACATTGTCTT TTTRATGTTG TAGTGAACAG CAGAGAATTT 120 

CAAAGGACCT TGCTAATATC TGTAAGACGG CAGCTACAGC AGGCATCATT GGCTGGGTGT 180 

ATGGGGGAAT ACCAGCTTTT ATTCATGCTA AACAACAATA CATTGAGCAG AGCCAGGCAG 240 

AAATTTATCA TAACCGGTTT GATGCTGTGC AATCTGCACA TCGTGCTGCC ACACGAGGCT 300 

TCATTCGTTA TGGCTGGCGC TGGGGTTGGA GAACTGCAGT GTTTGTGACT ATATTCAACA 360 

CAGTGAACAC TAGTCTGAAT GTATACCGAA ATAAAGATGC CTTAAGCCAT TTTGTAATTG 420 

CAGGAGCTGT CACGGGAAGT CTTTTTAGGA TAAACGTAGG CCTGCGTGGC CTGGTGGCTG 480 

GTGGCATAAT TGGAGCCTTG CTGGGCACTC CTGTAGGAGG CCTGCTGATG GCATTTCAGA 540 

AGTACTCTGG TGAGACTGTT CAGGAAAGAA AACAGAAGGA TCGAAAGGCA CTCCATGAGC 600 

TAAAACTGGA AGAGTGGAAA GGCAGACTAC AAGTTACTGA GCACCTCCCT GAGAAAATTG 660 

AAAGTAGTTT ACAGGAAGAT GAACCTGAGA ATGATGCTAA GAAAATTGAA GCACTGCTAA 720 

ACCTTCCTAG AAACCCTTCA GTAATAGATA AACAAGACAA GGACTGAAAG TGCTCTGAAC 780 

TTGAAACTCA CTGGAGAGCT GAAGGGAGCT GCCATGTCCG ATGAATGCCA ACAGACAGGC 840 

CACTCTTTGG TCAGCCTGCT GACAAATTTA AGTGCTGGTA CCTGTGGTGG CAGTGGCTTG 900 

CTCTTGTCTT TTTCTTTTCT TTTTAACTAA GAATGGGGCT GTTGTACTCT CACTTTACTT 960 

ATCCTTAAAT TTAAATACAT ACTTATGTTT GTATTAATCT ATCAATATAT GCATACATGA 1020 

ATATATCCAC CCACCTAGAT TTTAAGCAGT AAATAAAACA TTTCGCAAAA GATTAAAGTT 1080 

GAATTTTACA GTTAAAAAAA AAAAAAAAAA AAAAAAA 1117 



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1313 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

GGCAGAGGTT TTCTTATATT TTAAGTAAAT TTAAAGTGGC TATCAGAATA TTTATTCTTG 60 

TTTGAGACTA CCAACATAAC TACGTGTTGA AGGTGCTTCA CAGAGAATAT ATTGCCTTTA 120 

ATGTGAAATA ATTTTCACCA ATGTTGCTAA CTTTAATAAA GTATAAAATT TGTAGAATAT 180 

TCAGTTAAGT AGTTGGTAAC CCTTTTCTAT TTTAGTAAAA CTTAATGCAT GTTTACTTTT 240

TTTTGAAAGA TGCAGACAAT CTCTTTGAAC ATGAATTGGG GGCTCTCAAT ATGGCTGCAT 300 

TACTACGAAA AGAAGAAAGA GCAAGTCTTC TTAGTAATCT TGGCCCATGT TGTAAGGCGT 360 

TGTGCTTCAG ACGGGATTCT GCAATTCGAA AGCAGCTTGT TAAAAATGAG AAGGGCACCA 420 

TAAAACAAGC TTACACGAGT GCTCCAATGG TAGACAATGA ATTACTTCGA TTGAGTCTTC 480 

GGTTATTTAA GCGGAAGACT ACTTGCCATG CTCCAGGACA TGAAAAGACT GAAGATAATA 540

AACTTTCACA GTCCAGTATC CAACAGGAAC TGTGTGTGTC TTAAGACCGA AGTTACAATA 600 

TGGTATTTTT GGTACTGTCT TCCTTCAGCA GTGCATATTC TTTTGCAAAG TTCITTGGTT 660 

TGACAAGCAT TAGTGACAAA GGCAGAAAAG ATTTATCAGC CATGCTAAAA GAGTGAAGAA 720 

TTTTGATCTT TAGAGACACT AGTTTTGGCC AACTTAAGAT TTTACGTTAA TTTTTACATA 780 

GTATTTGACA CTCATGCAAA ATAATGTGAA AACATCTAGA TTTAGTAGTT TATTCTGCGC 840

CTTTTGTTAA AACTGAAGAT TTTGGAAAAT GGTTGTCACT GCTCTTCCAG CCTATGAATA 900 

TTTTTGTGAA ATGGAACCAT GGATTTATGT CTGGATCATC CATACAGAAC CAACAATTTT 960 

ATTCAAAAAC AATGTGTTCA TCAAAGTAAT TGCTCACATT GTGCAGTACT ATGTTGTACA 1020 

GACCACGTGA AAGGGAATGC TGGTCTAGCT GGCGTGGTAT GTTTATAGGC GAATTTCAGC 1080 

AGAAGGAAGC CAAAATAGTT TTTTCCTTTT GAAAGTTTTT TAAAAA1TAT TTCATGGGTC 1140

TTTTTTTTAA TTAATATGTG TGCATTGTTA CAATGTATGT TGGGATGTCT TTTGACCCTA 1200 

AATGCTTTTT TTGTTATCAG AGATTGTGTA CTATTTTTAT TTTTAATAAA TGTATCTTCC 1260 

CTTTTMAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 1313 



(2) INFORMATION FOR SEQ ID NO: 113: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1654 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

ACAGGGACAG AATACTCTCT TTCCTTCCTT CAAGTACAAG AAGGCTTTCT CTACCATTTG 60 

CGTCTACACT TTATTTTAAA AGCTATCCTT TTCTAGTAGT ATTTTATCAT GGCAATGGCA 120 

TGATGACAAC AACAGTCTTT CATTACAGAC TGAAGGGAAG CATGTCCTTA CTTAAAATAG 180

TTCTGCTACT TTCCCTCCTA TTATAAGGAA ATTTTACAGA TTCTAAAAAT ACCTTAATTT 240 

TTCTTTGATT TTTATTTTAC CAAGTCACAA ATGTCTTTTT GATGTTTTGA GAATTGTTCT 300 

CATAGAATCA CAAATACTGA CATTTCATTA GATGATTATT TTCCTAGAAT CCCCAAAGAG 360 

CAGTGGCAGT CCATGGCTTG GTTGAAGCTA GAAATTTTCC TGCCCCTGGT GACCTGGTAA 420 

GCCTCCTGCT CGGAACCGTG TGAGTGGGTG AGGAAGATGA GAGATGGTCA GATGGAAGAG 480

AGRAATACAT GAACTGCTCT GGCCTCTCTG GTTCTGTTCT TGGCCCAGAG TTTTTGAAAA 540 

GCAGCGGANA TNGACTGACT TCACATGCTC AGCTTTCTCA GCCTTTTGTT TATTTTGTTG 600 

TCCTTAGATT TCCCTGTTGT AAAAGGGGCA AGAAAAGTAA CTCATCATCT CTAACACACC 660 

ATGGCAGCTT AGCCAGGTAG TCTTAGTGGT GGTGTTTAGG CATAAGATAT GCTGATCATC 720 

AGTCTCAGGC CACAGTTTCC TTCACTAATC GTCCAGCTTG AGTGTTCTGT TCTCTTCCTG 780

CCCATTTCCT TGAACCTCCT GCTCTAGCCT TGGCGGAGGG AGAGTGCTAT TTCCTTTTGT 840 

TCTCCCTCTG TCTTAGGAAA AGCCATCTTT AATATAGTTC TTCACCACTG TTGGGGTTGT 900 

TTTGTGATTT TTTTTTTCTT CCGAAGAACT CCTGGTTGTT ATTGGATTTT GTATTTTAAT 960 

ACAAATTATT GAATTTTATA AGC1TGTACA CAATATTTAA TTAGTGTGAA AGGAAACAAA 1020 

GAATGCAGGA AAAATAATTT AATATCAACC TCAGTTGACA AGGTGCTCAG ATTATTCAAT 1080

TCGGGATCCT CCTTTTGTTA GGTTTTTGAG ACAACCCTAG ACCTAAACTG TGTCACAGAC 1140 

TTCTGAATGT TTAGGCAGTG CTAGTAATTT CCTCGTAATG ATTCTGTTAT TACTTTCCTA 1200 

TTCTTTATTC CTCTTTCTTC TGAAGATTAA TGAAGTTGAA AATTGAGGTG GATAAATACA 1260 

AAAAGGTAGT GTGATAGTAT AAGTATCTAA GTGCAGATGA AAGTGTGTTA TATACATCCA 1320 

TTCAAAATTA TGCAAGTTAG TAATTACTCA GGGTTAACTA AATTACTTTA ATATGCTGTT 1380

GAAYCTACTC TGTTCCTTGG CTAGAAAAAA TTATAAACAG GACTTTGTAG TTTGGGAAGC 1440 

CAAATTGATA ATATTCTATG TTCTAAAAGT TGGGCTATAC ATAAATTATT AAGAAATATG 1500 

GATTTTTATT CCCAGGATAT GGTGTTCATT TTATGATATT ACGCAGGATG ATGTATTGAG 1560 

TAAAATCAGT TTTGTAAATA TGTAAATATG TCATAAATAA ACAATGCTTT GACTTATTTC 1620 

CAAAAAAAAA AAAAAATAAA NTTCGAGGGG GGGC 1654

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1171 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GGCAAACTTT CCCCCAANGC TTCGAAACTT GCAAGCCGAA ACCTTGAATC GTTAAAAGTT 60



GGGTTGCGNC GGCGCCCTGG CCCGAAGAAG CGCAATTGGC GTTCCGCGAA CGTTGGCCCT 120 



CAACGGCTCG GCAGCCAGCC ATGTCCTGCA CCCAGGACAG CGGCCCTGGG CTACAAGGAC 180 

CTGGMCCTCA TCTTCCTGCG CCGACCTGCG CGGGGTAAGG GGWAGTTTCA GACTGTGAAG 240 



GACGTCGTGC TGGACTGCCT GTTGGACTTC TTACCCGAGG GGGTGAACAA AGAGAAGATC 300 
ACACCACTCA CGCTCAAGGA AGCTTATGTG CAGAAAATGG TTAAAGTGTG CAATGACTCT 360



GACCGATGGA GTCTTATATC CCTGTCAAAC AACAGTGGCA AAAATGTGGA ACTGAAATTT 420 



GTGGATTCCC TCCGGAGGCA GTTTGAATTC AGTGTAGATT CTTTTCAAAT CAAATTAGAC 480 

TCTCTTCTGC TCTTTTATGA ATGTTCAGAG AACCCAATGA CTGAGACATT TCACCCCACA 540 



ATAATCGGGG AGAGCGTCTA TGGCGATTTC CAGGAAGCCT TTGATCACCT TTGTAACAAG 600 
ATCATTGCCA CCAGGAACCC AGAGGAAATC CGAGGGGGAG GCCTGCTTAA GTACTGCAAC 660
CTCTTGGTGA GGGGCTTTAG GCCCGCCTCT GATGAAATCA AGACCCTTCA AAGGTATATG 720 



TGTTCCAGGT TTTTCATCGA CTTCTCAGAC ATTGGAGAGC AGCAGAGAAA ACTGGAGTCC 780 

TATTTGCAGA ACCACTTTGT GGGAATTGGA AGACCGCAAG TATGAGTATC TCATGACCCT 840 



TCATGGAGTG GTAAATGAGA GCACAGTGTG CCTGATGGGA CATGAAAGAA GACAGACTTT 900 
AAACCTTATC ACCATGCTGG CTATCCGGGT GTTAGCTGAC CAAAATGTCA TTCCTAATGT 960



GGCTAATGTC ACTTGCTATT ACCAGCCAGC CCCCTATGTA GCAGATGCCA ACTTTAGCAA 1020 



TTACTACATT GCACAGGTTC AGCCAGTATT CACGTGCCAG CAACAGACCT ACTCCACTTG 1080 

50 

GCTACCCTGC AATTAAGAAT CATTTAAAAA TGTCCTGTGG GGAAGCCATT TCAGACAAGA 1140
CAGGAGAGAA AAAAAAAAAA AAAAAAAAAA A 1171 

55 

(2) INFORMATION FOR SEQ ID NO: 115: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 842 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

GGTCTGCGCC GGAAGTGCAT GAGCTGCCGA TGTGGTGCTT AGTGATTGCG GTTTCGGTCG 60 

10 CTCTCCCGTG TTTCCCGGGC TGGGTATTTG CCTCGCACCA TGGCGCCCAA GGGCAAAGTG 120 

GGCACGAGAG GGAAGAAGCA GATATTTGAA GAGAACAGAG AGACTCTGAA GTTCTACCTG 180 

CGGATCATAC TGGGGGCCAA TGCCATTTAC TGCCTTGTGA CGTTGGTCTT CTTTTACTCA 240 

TCTGCCTCAT TTTGGGCCTG GTTGGCCCTG GGCTTTAGTC TGGCAGTGTA TGGGGCCAGC 300 

TACCACICTA TGAGCTCGAT GGCACGAGCA GCGTTCTCTG AGGATGGGGC CCTGATGGAT 360 

20 GGTGGCATGG ACCTCAACAT GGAGCAGGGC ATGGCAGAGC ACCTTAAGGA TGTGATCCTA 420 

CTGACAGCCA TCGTGCAGGT GCTCAGCTGC TTCTCTCTCT ATGTCTGGTC CTTCTGGCTT 480 

CTGGCTCCAG GCCGGGCCCT TTACCTCCTG TGGGTGAATG TGCTGGGCCC CTGGTTCACT 540 

GCAGACAGTG GCACCCCAGC ACCAGAGCAC AATGAGAAAC GGCAGCGCCG ACAGGAGCGG 600 

CGGCAGATGA AGCGGTTATA GCCATTGACA TTGTGGCCAC AGGCCACTGG CCCTGGGTGG 660 

30 CTCTGTCAGG GTGCACAGCC CCTCATGCCT GGAGCAATGA GGGTCTAGTC CAGGGGCCAA 720 

AAGCAGTCTG AGGTATTGGG TATACTTATA CTCTATAGGG TCGTTGAATA AATGGCTTAG 780 

AATGTGAAAA AAAAAAAAAA AAAAAACTCG AGGGGGGCCC GGTACCCAAT TTCNCCTANA 840

AT 842



(2) INFORMATION FOR SEQ ID NO: 116:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1640 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

GGCACGAGGC GGCGGCAGCG GTGGCGGCGG CGCCCCCCGG CGGGAGCCGT TCCCTTTCCC 60 

GTCGGGGAGC GCGGGGYCGG GGCCCAGGGG ACCCCGGGCC ACGGAGAGCG GGAAGAGGAT 120 

55 GGATTGCCCG GCCCTCCCCC CCGGATGGAA GAAGGAGGAA GTGATCCGAA AATCTGGGCT 180 

AAGTGCTGGC AAGAGCGATG TCTACTACTT CAGTCCAAGT GGTAAGAAGT TCAGAAGCAA 240 

GCCTCAGTTG GCAAGGTACC TGGGAAATAC TGTTGATCTC AGCAGTTTTG ACTTCAGAAC 300 

TGGAAAGATG ATGCCTAGTA AATTACAGAA GAACAAACAG AGACTGCGAA ACGATCCTCT 360

CAATCAAAAT AAGGGTAAAC CAGACTTGAA ATACAACATT GCCAATTAGA CAAACAGCAT 420 

CAATTTTCAA ACAACCGGTA ACCCAAAGTC ACAAATCATC CTAGTAATAA AGTGAAATCA 480 

GACCCACAAC GAATGAATGA ACAGCCACGT CAGCTTTTCT GGGAGAAGAG GCTACAAGGA 540 

CTTTAGTGCA TCAGATGTAA CAGAACAAAT TATAAAAACC ATGGAACTAC CCAAAGGTCT 600 

TCAAGGAGTT GGTCCAGTAG CAATGATGAG ACCCTTTTAT CTGCTGTTGC CAGTGCTTTG 660 

CACACAAGCT CTGCGCCAAT CACAGGGCAA GTCTCCGCTG CTGTGGAAAA GAACCTGCTG 720 

15 TTTGGCTTAA CACATCTCAA CCCCTCTGCA AAGCTTTTAT TGTCACAGAT GAAGACTCAG 780 

GAAACAGAAG AGCGAGTACA GCAAGTACGC AAGAAATTGG AAGAAGCACT GATGGCAGAC 840 

ATCTTGTCGC GAGCTGCTGA TACAGAAGAG ATGGATATTG AAATGGACAG TGGAGATGAA 900 

20 

GCCTAAGAAT ATGATCAGGT AACTTTCGAC CGACTTTCCC CAAGAGAAAA TTCCTAGGAA 960 

ATTGAACAAA AATGTTTCCA CTGGCTTTTG CCTGTAAGAA AAAAAATGTA CCCGAGCACA 1020 

25 TAGAGCTTTT TAATAGCACT AACCAATGCC TTTTTAGATG TATTTTTGAT GTATATATCT 1080 

ATTATTCAAA AAATCATGTT TATTTTGAGT CCTAGGACTT AAAATTAGTC TTTTGTAATA 1140 

TCAAGCAGGA CCCTAAGATG AAGCTGAGCT TTTGATGCCA GGTGCAATCT ACTGGAAATG 1200 

30 

TAGCACTTAC GTAAAACATT TGTTTCCCCC ACAGTTTTAA TAAGAACAGA TCAGGAATTC 1260 

TAAATAAATT TCCCAGTTAA AGATTATTGT GACTTCACTG TATATAAACA TATTTTTATA 1320 

35 CTTTATTGAA AGGGGACACC TGTACA1TCT TCCATCGTCA CTGTAAAGAC AAATAAATGA 1380 

TTATATTCCA CAGAAAAAAA AAAAAAAAAW MWSTYGARRR GSRGCMCRSW AYMMARWWCC 1440 

CCWMRTWRGS MKTCSTMTKA YTTACATTCA ACTCTGATCC CGGGGCCTTA GGTTTGACAT 1500 

40 

GGGAGGTGGG AGGAAGATAG CGCATATATT TGCAGTATGA ACTATTGCCT CTGGGACGTT 1560 

GTGAGGAATT GTGCTTTCAC CAGAATTTCT AAGGATTTCT GGCTTAAATA TCACCTAGCC 1620 

45 TGTGGTAATT TTTTTTCCCT 1640 



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 952 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

TGAATTTAGN AAACACTTTG GAAAACTCAT AACCTCATCA GAAACTGCCT TTAGCCACAC 60

TCCTGACCTT CTAGATGAGT AACAAAAAAA TGAAATAAGT TCTTGGAAAT TAAGCCATTT 120

ATTTTAATTT GCTATTTTTT TCAATGTTCT AGGTATCTTT AAATTTGTTA TTGTGGAATC 180 

5 

ATTTTCCTGC CAGATACCTT TATCAAAATT ATTGGCCTCA TGAGAGCTGA AGTAAGTCAG 240 

CTTTTTGGTG AACTTTAGTG GACTTCTGTG AGATTGTAGT TGTACTTTGT ATCTCTAAAT 300 

10 CTAAAGATAG TTTTTTAAAA CTCCCAAAGA AAATCTGCTC TCCTTTCTGA TCTAAAAACT 360 

CATCTTTGGG GTAAAGAGTT AAGTGTCCAA AGGTTGTCAC AGTTCATGAG GTCAGAGGGA 420 

GCTAGCCTGG CACCTGGACT CTGCCCATCC ACAGGTGACA GATTCCAACA GAAGTGTATT 480 

15 

TAAATTCTCC AGTAGACAAT GCTGGGTAAG GGAGGGGGTA GGGCTGGGTT ATTAAGATAC 540 

AGGCTGCTGT ATTTTACATT GGTTGTGGGG GAAGGGGAGC CTGGAGAAAA CAAAGTCACT 600 

20 ATTCCCTTTT TTGAAACAGG AAAAAAAATT ATTTTTTGTT CAGTAAAAAT GGTAGAGAAT 660 

TCCAATGTCC CTAGCCACAA GGGACCAGTT CCACTGAGAA GTGAACAGTG GGAACTCAAA 720 

ATTTCAGAAA CATTGGGGGA AGGGAAAATT GGCTTTCTCT TAATTGGCAG ATGTTCCAGT 780 

25 

GGGGSGGGGG GGCTCTGTTT TTGTTGGGAT GTGTTATGTT GTATGTACGC ATATATGGAC 840 

CGGAGTCTGC TGAGTTTATA AGGTTCCAAA AATATGGTAA AATCTTGGTT TTTGTTAATT 900 

30 TATCTCAATA AAAGCCCACT GGRACTCCAA AAAAAAAAAA AAAAAAAAGA NN 952 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1256 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

GACGTCATAG GTAAACAGGC TCTGTATCCG TGGCAGCGGC CGTGGCAGGC TGGCTGGGTA 60

CCGGCTGTCG CTGACCCAGG AGAAGCTGCC TGTCTACATC AGCCTGGGCT GCAGCGCGCT 120 

GCCGCCGCGG GGCCGGCAGC TGAACTATGT GCTCTTCAGG GCGGGCACCG TGTTGCATTC 180 

50 

ATCTTTGTAC CCCCAGCATC TAGCAGTGTT GGCATGTAGT AGGCACTCAA GAAATGTGTG 240 

TTGAATGAAC GATGCCTGTG ACAAGCAAGC GGACTTTATT CTTTCCTGAC CCTTGCTCCT 300 

55 ATGACACACC TCCTCCTGAC TGCCACTGTC ACTCCTTCAG AGCAGAACTC CTCTAGGGAA 360 

CCTGGATGGG AAACAGCCAT GGCCAAGGAC ATCCTGGGTG AAGCAGGGCT ACACTTTGAT 420 

GAACTGAACA AGCTGAGGGT GTTGGACCCA GAGGTTACCC AGCAGACCAT AGAGCTGAAG 480 

GAAGAGTGCA AAGACTTTGT GGACAAAATT GGCCAGTTTC AGAAAATAGT TGGTGGTTTA 540

ATTGAGCTTG TTGATCAACT TGCAAAAGAA GCAGAAAATG AAAAGATGAA GGCCATCGGT 600 

GCTCQGAACT TGCTCAAATC TATAGCAAAG CAGAGAGAAG CTCAACAGCA GCAACTTCAA 660 

GCCCTAATAG CAGAAAAGAA AATGCAGCTA GAAAGGTATC GGGTTGAATA TGAAGCTTTG 720 

TGTAAAGTAG AAGCAGAACA AAATGAATTT ATTGACCAAT TTATTTTTCA GAAATGAACT 780 

GAAAATTTCG CTTTTATAGT AGGAAGGCAA AACAAAAAAA AGCCTCTCAA AACCAAAAAA 840 

ACCTCTGTAG CATTCCAGCG GCTTGACCAA TGACCTATGT CACAAGAGGT GGCGTGTAAG 900 

GAATGCAGCC CCCTGAAGAC AGCACTACAA GTCTGGGGGA GCCAGTTTTA ACATCAGTGC 960 

ACAGCTGCTG CTGGTGGCCC TGCAGTGTAC GTTCTCACCT CTTATGCTTA GTTGGAACTA 1020 

AGCAGTTTGT AAACTTTCAT CCTTTTTTTT GTAAATTCAC AAAGCTTTGG AAGGAGAAGC 1080 

AATAAATTTT TGTTTTCAAA TGGCTTGATG TACCTTTTTT CCTGTTGCTC TTGAAATATG 1140 

TTTAACTCCT CATGAGAGAA CCCTGGATTC TCTATCCCCT AGTCCACAAA ACAAACCAGG 1200 

CAGTGGTCAG CAGCTACCTT TNATTTGGAT CACACACGTG AGTCAGACAG TACCAC 1256



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1143 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

GGCCGTAGCA GCCGGGCTGG TCCTGCTGCG AGCCGGCGGC CCGGAGTGGG GCGGCGGCAT 60 

GTACCTTCCA CATTGAGTAT TCAGAAAGAA GTGATCTGAA CTCTGACCAT TCTTTATGGA 120 

TACATTAAGT CAAATATAAG AGTCTGACTA CTTGACACAC TGGCTCGAGC AAACATGAAC 180 

GTTGGAGTTG CCCACAGTGA AGTGAATCCA AATACCCGTG TCATGAACAG CCGGGGTATG 240 

TGGCTGACAT ATGCATTGGG AGTTGGCTTG CTTCATATTG TCTTACTCAG CATTCCCTTC 300 

TTCAGTGTTC CTGTTGCTTG GACTTTAACA AATATTATAC ATAATCTGGG GATGTACGTA 360 

TTTTTGCATG CAGTGAAAGG AACACCTTTC GAAACTCCTG ACCAGGGTAA AGCAAGGCTC 420 

CTAACTCATT GGGAACAACT GGACTATGGA GTACAGTTTA CATCTTCACG GAAGTTTTTC 480 

ACAATTTCTC CAATAATTCT ATATTTTCTG GCAAGTTTCT ATACGAAGTA TGATCCAACT 540 

CACTTCATCC TAAACACAGC TTCTCTCCTG AGTGTACTAA TTCCCAAAAT GCCACAACTA 600 

CATGGTGTTC GGATCTTTGG AATTAATAAG TATTGAAATG TTTTGAAACT GAAAAAAAAT 660

TTTACAGCTA CTGAATTTCT TATAAGGAAG GAGTGGTTAG TAAACTGCAC TGTTTCTSTG 720

ATAATGTGAA ATGAGAAGTA TTTACATTGG AGGGCCAATG GCTGGTCCTT CAAGTGCTGT 780 

5 

TTTGAAGTGC AGATTTCCAT TAAATGATGC CTCTG1TTAA TACACCTGGT ACATTTCTGA 840 

AGAGGGGCTT TATAAGCAGG CTGGGCAGGC CCAGCTTATA AGTTAAAGGG CATCACAGTG 900 

10 AGGGTGTAGT AGATAAATTC AAGGAAATAA GAGATTTGTA AGAAACTAGG ACCAGCTTAA 960 

CTTATAATGA ATGGGCATTG TGTTAAGAAA AGAACATTTC CAGTCATTCA GCTGTGGTTA 1020 

TTTAAAGCAG ACTTACATGT AAACCGGAAT CCTCTCTATA CAAGTTTATT AAAGATTATT 1080 

15 

TTTATTACCG TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAANA 1140 

GAN 1143 

20 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1782 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 
CAGGCCCCGG CCCCCCACCC ACGTCTGCGT TGCTGCCCCG CCTGGGCCRG GCCCCAAAGG 60 
35 CAAGGACAAA GCAGCTGTCA GGGAACCTCC GCCGGAGTCG AATTTACGTG CAGCTGCCGG 120 
CAACCACAGG TTCCAAGATG GTTTGCGGGG GCTTCGCGTG TTCCAAGAAC TGCCTGTGCG 180 
CCCTCAACCT GCTTTACACC TTGGTTAGTC TGCTGCTAAT TGGAATTGCT GCGTGGGGCA 240 

40 

TTGGCTTCGG GCTGATTTCG AGTCTCCGAG TGGTCGGCGT GGTCATTGCA GTGGGCATCT 300 
TCTTGTTCCT GATTGCTTTA GTGGGTCTGA TTGGAGCTGT AAAACATCAT CAGGTGTTGC 360 
45 TATTYTTTTA TATGATTATT CTGTTACTTG TATTTATTGT TCAGTTTTCT GTATCTTGCG 420 
CTTGTTTAGC CCTGAACCAG GAGCAACAGG GTCAGCTTCT GGAGGTTGGT TGGAACAATA 480 
CGGCAAGTGC TCGAAATGAC ATCCAGAGAA ATCTAAACTG CTGTGGGTTC CGAAGTGTTA 540 

50 

ACCCAAATGA CACCTGTCTG GCTAGCTGTG TTAAAAGTGA CCACTCGTGC TCGCCATGTG 600 
CTCCAATCAT AGGAGAATAT GCTGGAGAGG TTTTGAGATT TGTTGGTGGC ATTGGCCTGT 660 
55 TCTTCAGTTT TACAGAGATC CTGGGTGTTT GGCTGACCTA CAGATACAGG AACCAGAAAG 720 
ACCCCCGCGC RAATCCTAGT GCATTCCTTT GATGAGAAAA CAAGGAAGAT TTCCTTTCGT 780 
ATTATGATCT TGTTCACTTT CTGTAATTTT CTGTTAAGCT CCATTTGCCA GTTTAAGGAA 840 

GGAAACACTA TCTGGAAAAG TACCTTATTG ATAGTGGAAT TATATATTTT TACTCTATGT 900

TTCTCTACAT GTTTTTTTCT TTCCGTTGCT GAAAAATATT TGAAACTTGT GGTCTCTGAA 960

GCTCGGTGGC ACCTGGGAAT TTACTGTATT CATTGTCGGG CACTGTCCAC TGTGGCCTTT 1020

CTTAGCATTT TTACCTGCAG AAAAACTTTG TATGGTACCA CTGTGTTGGT TATATGGTGA 1080

ATCTGAACGT ACATCTCACT GGTATAATTA TATGTAGCAC TGTGCTGTGT AGATAGTTCC 1140

TACTGGAAAA AGAGTGGRAA TTTATTAAAA TCAGAAAGTA TGAGATCCTG TTATGTTAAG 1200

GGAAATCCAA ATTCCCAATT TTTTTTGGTC TTTTTAGGAA AGATGTGTTG TGGTAAAAAG 1260

TGTTAGTATA AAAATGATAA TTWACTKGTA GTCTTTTATG ATWACACCAA TGTATTCTAG 1320

AAATAGTTAT GYCYTAGGAA ATTGTGGTTT AATTTTTGAC TTTTACAGGT AAGTGCAAAG 1380

GAGAAGTGGT TTCATGAAAT GTTCTAATGT ATAATAACAT TTACCTTCAG CCTCCATCAG 1440

AATGGAACGA GTTTTGAGTA ATCAGGAAGT ATATCTATAT GATCTTGATA TTGTTTTATA 1500

ATAATTTGAA GTCTAAAAGA CTGCATTTTT AAACAAGTTA GTATTAATGC GTTGGCCCAC 1560

GTAGCAAAAA GATATTTGAT TATCTTAAAA ATTGTTAAAT ACCGTTTTCA TGAAAGTTCT 1620

CAGTATTGTA ACAGCAACTT GTYAAACCTA AGCATATTTG AATATGATCT CCCATAATTT 1680

GAAATTGAAA TCGTATTGTG TGGCTCTGTA TATTCTGTTA AAAAATTAAA GGACAGAAAC 1740

CTTTCTTTGT GTATGCATGT TTGAATTAAA AGAAAGTAAT GG 1782



(2) INFORMATION FOR SEQ ID NO: 121: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

GTTGGCTGCA GATTTGTGGT GCGTTCTGAG CCGTCTGTCC TGCGCCAAGA TGCTTCAAAG 60 

TATTATTAAA AACATATGGA TCCCCATGAA GCCCTACTAC ACCAAAGTTT ACCAGGAGAT 120 

50 TTGGATAGGA ATGGGGCTGA TGGGCTTCAT CGTTTATAAA ATCCGGGCTG CTGATAAAAG 180 

AAGTAAGGCT TTGAAAGCTT CAGCGCCTGC TCCTGGTCAT CACAACCAGA TTTACTTGGA 240 

GTACATGTGA AAGAAAACGT CAGTCTGCCT GTAAATTTCA GCAAGCCGTG TTAGATGGGG 300 

AGCGTGGAAC GTCACTGTAC ACTTGTATAA GTACCGTTTA CTTCATGGCA TGAATAAATG 360 

GATCTGTGAG ATGCACTGCT ACCTGGTACT GCTTTCAGTG TGTTCCCCCT CAGCCCTCCG 420 

GCGTGTCAGG CATACTCTGA GTAGATAATT TGTCATGCAG CGCATGCAAT CAGAATCTCA 480

CTGAGCCACC CATCATTGTG AAATAATTAC CTCAGTTGTA CAGGACTTGG TGATCAGGAT 540

CCAGGCACTC ACTTGTATTC TACTGCTCAA TAAACGTTTA TTAAACTTGA AAAAAAAAAA 600

AAAAAAAAAA 610



10 

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 526 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

20 

GGTACGCCTG CAGGTACCGG TCCGGAATTC CGGGTCGCCC ACGCGTCNGG CCACGCGTCC 60 

ACCCACGCGT CCGSCCACGC GTCGGAGCCG AGCCGGACTG GTCAGGATGA TCACGGACGT 120 

25 GCAGCTCGCC ATCTTCGCCA ACATGCTGGG CGTGTCGCTC TTCTTGCTTG TCGTTCTCTA 180 

TCACTACGTG GCCGTCAACA ATCCCAAGAA GCAGGAATGA AAGTGGCGCT TTCTCCGCCC 240 

CAGGGTTCCA GGACATAGTC TGAGGCAAGA TGGAGGGTAT GAGGGGCCTT CACkCTTCAC 300 

30 

TTCATCCCTT CTACCCATCA CAACATACAA AGCAACTACA CCTGGATTTT TCCAAACAAC 360 

TTTTATTTCC TCAGAGTCTT CCTTAATCCT ATGGAACAAG AAGCTGCCAC TGAATAGGGC 420 

35 CCAGTATAGG GGCTTGCTTT TCTACTCCCT CCCCCCAATA TAAAAATATA GACTPTTTAA 480 

AAAAAAAAAA AAAAANTTCG NGGGGGGSCC GGTACCCATC CCCCTA 526 



40 

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2081 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

TGTACCGGTC CGGAAATTCC CGGGTCGACC CACGTCGTCS GGGGAACATG GCGGCTXCGG 60 

AGCCGGCGGT CCTTGCGCTC CCCAACAGCG GCGCCGGGGG CGCGGGGGCG CCGTCGGGCA 120 

55 

CAGTCCCGGT GCTCTTCTGT TTCTCAGTCT TCGCGCGACC CTCGTCGGTG CCACACGGGG 180 

CGGGCTACGA GCTGCTCATC CAGAAGTTCC TCAGCCTGTA CGGCGACCAG ATCGACATGC 240 

ACCGCAAATT CGTGGTGCAG CTGTTCGCCG AGGAGTGGGG CCAGTACGTG GACTTGCCCA 300

AGGGCTTCGC GGTRAGCGAG CGCTGCAAGG TGCGCCTCGT GCCGYTGCAG ATCCAGCTCA 360

CTACCCTGGG AAATCTTACA CCTTCAAGCA CTGTGTTTTT CTGCTGTGAT ATGCAGGAAA 420 

5 

GGTTCAGACC AGCCATCAAG TATTTTGGGG ATATTATTAG CGTGGGACAG AGATTGTTGC 480 

AAGGGGCCCG GATTTTAGGA ATTCCTGTTA TTGTAACAGA ACAATACCCT AAAGGTCTTG 540 

10 GGAGCACGGT TCAAGAAATT GATTTAACAG GTGTAAAACT GGTACTTCCA AAGACCAAGT 600 

TTTCAATGGT ATTACCAGAA GTAGAAGCGG CATTAGCAGA GATTCCCGGA GTCAGGAGTG 660 

TTGTATTATT TGGAGTAGAA ACTCATGTGT GCATCCAACA AACTGCCCTG GAGCTAGTTG 720 

15 

GCCGAGGAGT CGAGGTTCAC ATTGTTGCTG ATGCCACCTC ATCAAGAAGC ATGATGGACA 780 

GGATGTTTGC CCTCGAGCGT CTCGCTCRAR CCGGGATCAT AGTGACCACG AGTGAGGCTG 840 

20 TTCTGCTTCA GCTGGTAGCT GATAAGGACC ATCCAAAATT CAAGGAAATT CAGAATCTAA 900 

TTAAGGCGAG TGCTCCAGAG TCGGGTCTGC TTTCCAAAGT ATAGGACATT TGAAGAACTG 960 

GTATGCTACT CACTGGTGAA GGACAGTCAG GTGAAGGACT GTAAGCCCAC ACAAGCTCTT 1020 

25 

CTTATCTCrA CTAGAATTAA AATGTTAAGT CAAAAACGGC TCCTTTTTTG CGCCTCCTAG 1080 

TGAAACTTAA CCAGCTAGAC CATTTGAGTA CCAGCATTTA GTTACAAACG TCAAAGGCTT 1140 

30 CCGGTGCTGC TTACCTTCCT TTTTTGTTAA TGTGCTTTTA TTTATTAAAA AAAATTACAA 1200 

TGAAGATGCC TGTTTTGTCT CTACTGTGTA CTCTGATCGT ATCTTTCCAA AGTGCAGACT 1260 

CTTGTGAAGT TTTCTTAAAT TGTTCACTTT AAAGAAAATG ACGTACCAAC AATGATTTGG 1320 

35 

CTTTTATATT ACTGTAAGAT GTTATAATGT TAATGTGGAT GTAGTGCTTT TACTTTACAG 1380 

ATTGATTGGA ATAAGATTAT TGCATATGAA TTTACCCACA GGACTCTGAA TCATGTTACC 1440 

40 CACTCCCCTC ACAATGTTGT CCACTTAGTG AGTTGCATTG ATCTATCCGT ACCAAATGAT 1500 

GTTGAATAAT TACATATCTT TCTTGACTAT ACTGATTTCT TATTTTGGTC ACTATTACTA 1560 

AATCTCTGTT AATATTCTCT CTTTTAACTG AAAAGGGATG GGATAGAAGG GTTTGCAATG 1620 

45 

CCATATTATT GGTGGAGGGC TGTTTTAACA TCTTTGAAGT ATGGCTTGCT GAATATCTTT 1680 

ACCAACATCT TGAATATATA TTCTAGTGTC CACAAGATTT AGCAAAAAGA TAAAGCTTGG 1740 

50 GTGGAATATC ATTTTAAAAT GTTCATGTTC TGTTCTATAT TTTCTTCACC TACTCTCCAA 1800 

ATATTGTAAT GCAAAAAGTC TCAGTAATGA TTTGGTAGTA TTAATTTTGT GGTCATTGTT 1860 

TCTCTTCGAT AAATTTATTT TCATTAAATA CTTRTTAGAG GGTTTTGAAA TGTTTTTCAA 1920 

55 

ATATGTGAAA TGTGAAACTG CTGTCTTTTA TATTAAAGTA ATTAAAGAAA ATGTATTGTG 1980 

ATTGAAATTA TTTTGNCCTC CACAAGATGG CTCTATGAGT ATTCTTCCAG GGATTCTAAT 2040 

ATTTATTTAA GGTNATAAAA TCTTGACATT TATAATCTTT C 2081



(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1717 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

CCCCGGCGGA GCTGGACCCG CGGTGGGCTA GGGGCAGGGC CGGAGCCGCG GCGGCGGAGC 60

TGTGGATCCT TCATGATGAG AGATTTGGGG ACACTTCTCT CTCCTGTGTG TAGTTGATAG 120

TTTGGTGGTG AAGAGATGGC TGACAGTGTC AAAACCTTTC TCCAGGACCT TGCCAGAGGA 180

ATCAAAGACT CCATCTGGGG TATTTGTACC ATCTCAAAGC TAGATGCTCG AATCCAGCAA 240

AAGAGAGAGG AGCAGCGTCG AAGAAGGGCA AGTAGTGTCT TGGCACAGAG AAGAGCCCAG 300

AGTATAGAGC GGAAGCAAGA GAGTGAGCCA CGTATTGTTA GTAGAATTTT CCAGTGTTGT 360

GCTTGGAATG GTGGAGTGTT CTGGTTCAGT CTCCTCTTGT TTTATCGAGT ATTTATTCCT 420

GTGCTTCAGT CGGTAACAGC CCGAATTATC GGTGACCCAT CACTACATGG AGATGTTTGG 480

TCGTGGCTGG AATTCTTCCT CACGTCAATT TTCAGTGCTC TTTGGGTGCT CCCCTTGTTT 540

GTGCTTAGCA AAGTGGTGAA TGCCATTTGG TTTCAGGATA TAGCTGACCT GGCATTTGAG 600

GTATCAGGGA GGAAGCCTCA CCCATTCCCT AGTGTCAGCA AAATAATTGC TGACATGCTC 660

TTCAACCTTT TGCTGCAGGC TCTTTTCCTC ATTCAGGGAA TGTTTGTGAG TCTCTTTCCC 720

ATCCATCTTG TCGGTCAGCT GGTTAGTCTC CTGCATATGT CCCTTCTCTA CTCACTGTAC 780

TGCTTTGAAT ATCGTTGGTT CAATAAAGGA ATTGAAATGC ACCAGCGGTT GTCTAACATA 840

GAAAGGAATT GGCCTTACTA CTTTGGGTTT GGTTTGCCCT TGGCTTTTCT CACAGCAATG 900

CAGTCCTCAT ATATTATCAG TGGCTGCCTT TTCTCTATCC TCTTTCCTTT ATTCATTATC 960

AGCGCCAATG AAGCAAAGAC CCCTGGCAAA GCRTATCTCT TCCAGTTGCG CCTCTTCTCC 1020

TTGGTGGTCT TCTTAAGCAA CAGACTCTTC CACAAGACAG TCTACCTGCA GTCGGCCCTG 1080

AGCAGCTCTA CTTCTGCAGA GAAGTTCCCT TCACCGCATC CGTCGCCTGC CAAACTGAAG 1140

GCTACTGCAG GTCACTGAGT TGCCTGCCAT CCAAAGGGGA TGGGCGGGAT TGGAAGAAGC 1200

TGTGGCAGCT CTTTTCCCTG TTCACCTCCC GCCTGCCAGG GAAGGCAGGA CCCGCTCTGC 1260

CAAGGGCCCT CTGCGTATTC CCTTCTCTCT GAGGAATTGA AATTTTTGTC TCTGGTGCAC 1320

GTAAGGCAGA ATGTTCCCTG ACACCAGTGT GTGGATTTTT AACATCACCG TGAGTCTGAA 1380

AGGACCACAG g lTl ' l ' l CTGC AGCTATTTTC TAGCATTTGC CACTCCCTGT GCCTGGACTG 1440

ATTGGAACAC TTTGTTTTTC TCCCTGTGCC ATTTACCCTT CCACCTTTCC ATCCTGCCTT 1500 

CTACCACCCT TGGATGAATG GATTTTGTAA TTCTAGCTGT TGTATTTTGT GAA1TTGTTA 1560 

ATTTTGTTGT TTTTCTCTGA AACACATACA TTGGATATGG GAGGTAAAGG AGTGTCCCAG 1620 

TTGCTCCTGG TCACTCCCTT TATAGCCATT ACTGTCTTGT TTCTTGTAAC TCAGGTTAGG 1680 

TTTTGGTCTC TCTTGCTCCA CTGCAAAAAA AAAAAAA 1717 



(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 804 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

CCACGCGTCC GGTCACTATG TAGTGGAGGG GCAGACACCC TCCCGCAAAT TCTGGAAGGT 60

TCTTAGTCTC GACTAGGGCA GTAGCCCCAG GACTCCTAGT CGCCGGCTTC AGGTCACTGC 120

CGGCTGAACG GAGCTGCCGT CGCCATGTTT GGCTGCTTGG TGGCGGGGAG GCTGGTGCAA 180

ACAGCTGCAC AGCAAGTGGC AGAGGATAAA TTTGTTTTTG ACTTACCTGA TTATGAAAGT 240

ATCAACCATG TTGTGGTTTT TATGCTGGGA ACAATCCCAT TTCCTGAGGG AATGGGAGGA 300

TCTGTCTACT TTTCTTATCC TGATTCAAAT GGAATGCCAG TATGGCAACT CCTAGGATTT 360

GTCACGAATG GGAAGCCAAG TGCCATCTTC AAAATTTCAG GTCTTAAATC TGGAGAAGGA 420

AGCCAACATC CTTTTGGAGC CATGAATATT GTCCGAACTC CATCTGTTGC TCAGATTGGA 480

ATTTCAGTGG AATTATTAGA CAGTATGGCT CAGCAGACTC CTGTAGGTAA TGCTGCTGTA 540

TCCTCAGTTG ACTCATTCAC TCAGTTCACA CAAAAGATGT TGGACAATTT CTACAATTTT 600

GCTTCATCAT TTGCTGTCTC TCAGGCCCAG ATGACACCAA GCCCATCTGA AATGTTCATT 660

CCGGCAAATG TGGTTCTGAA ATGGTATGAA AACTTTCAAA GACGACTAGC ACAGAACCCT 720

NTNTTTTCGN AAACATAATT TGAATAAAAT AATTTTTAAT GGATTNTGNA AAAAAAAAAA 780

AAAAAAAAAA AAAAAAAAAA AAAA 804



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 431 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

GGCACAGCCC AGGGCCTTGA AGCCAGCTGG CCCTGGAGAG GGGCTGCTGT GCCAGCTTGG 60

GGAGGGTCTG GGATGGGGCT GCCCCTGATG GCCCTGATGT GGAGTACCTT GCCAGCATCT 120

GCTGGGCTGA ACTTTATTTT AGCCCTTCCC TTGTTGCTCT TATGGAAGAA CAGAGGAGGG 180

GTGGGCAGGT CAGTGATGTC AGCAGTGGAG TGATTCCCAG CACAGCGGCT TCTGGGAAGA 240

GGGCATGGAG GGATTTCTTT CAGGGAAATG GTCCATNATT TCAGCCAGAA GGCATTGCAT 300

TAAGTTAAGT CCNGGACTTT TGTGGCCCAG CTCTGTGTTA TTAAGGGCCC TTGGCGAAGA 360

CITCAAGGAG GGGGCAAAAN GACCTTTAAG TTTTTAGGTT TAACACAGGG AACCCNCAAA 420

GGGTTATTTT G 431

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3752 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

NGGCACGAGG AGAGTCACCT GGACTCAGAA CTAGAGATAT CCAATGACCC AGACAAAATT 60

AAACTTCAGC TTTCTAAGCA TAAGGAGTTT CAGAAGACTC TTGGTGGCAA GCAGCCTGTG 120

TATGATACCA CAATTAGAAC TGGCAGAGCA CTGAAAGAAA AGACTTTGCT TCCCGAAGAT 180

ASTCAGAAAC TTGACAATTT CCTAGGAGAA GTCAGAGACA AATGGGATAC TGTTTGTGGC 240

AAGTCTGTGG AGCGGCAGCA CAAGTTGGAG GAAGCCCTGC TCTTTTCGGG TCAGTTCATG 300

GATGCTTTGC AGGCATTGGT TGACTGGTTA TACAAQGTGG AGCCACAGCT GGCTGAGGAC 360

CAGCCCGTGC ACGGGGGACC TTGACCTCGT CATGAACCTC ATGGATGCAC ACAAGGTTIT 420

CCAGAAGGAA CTGGNGAAAG CGAACAGGAA CCGTTCAGGT CCTGAAGCGG TCAGGCCGAG 480

AGCTGATTGA GAATAGTCGA GATGACACCA CTTGGGTAAA AGGACAGCTC CAGGAACTGA 540

GCACTCGCTG GGACACTGTC TGTAAACTCT CTGTTTCCAA ACAAAGCCGG CTTGAGCAGG 600

CCTTAAAACA AGCGGAAGTG TTTCGAGACA CAGTCCACAT GCTGTTGGAG TGGCTTTCTG 660

AAGCAGAGCA AACGCTTCGC TTTCGGGGAG CACTTCCTGG ATGACACAGA GGCCCTGCAG 720

TCTCTCATTG ACACCCATAA GGAATTCATG AAGAAAGTAG AAGAAAAGCG AGTGGACGTT 780

AACTCAGCAG TAGCCATGGG AGAACTCATC CTGGCTGTCT GCCACCCCGA TTGCATCACA 840

ACCATCAAAC ACTGGATCAC CATCATCCGA GCTCGCTTCG AGGAGGTCCT GACATGGGCT 900 

5 

AAGCAGCACC AGCAGCGTCT TGAAACGGCC TTGTCAGAAC TGGTGGCTAA TGCTGAGCTC 960 

CTGGAAGAAC TTCTGGCATG GATCCAGTGG GCTGAGACCA CCCTCATTCA GCGGGATCAG 1020 

10 GAGCCAATCC CGCAGAACAT TGACCGAGTT AAAGCCCTTA TCGCTGAGCA TCAGACATTT 1080 

ATOGAGGAGA TGACTCGCAA ACAGCCTGAC GTGGACCGGG TCACCAAGAC ATACAAAAGG 1140 

AAAAACATAG AGCCTACTCA CGCGCCTTTC ATAGAGAAAT CCCGCAGCGG AGGCAGGAAA 1200 

15 

TCCCTAAGTC AGCCAACCCC TCCTCCCATG CCAATCCTTT CACAGTCTGA AGCAAAAAAC 1260 

CCACGGATCA ACCAGCTTTC TGCCCGCTGG CAGCAGGTGT GGCTGTTAGC ACTGGAGCGG 1320 

20 CAAAGGAAAC TGAATGATGC CTTGGATCGG CTGGAGGAGT TGAAAGAATT TGCCAACTTT 1380 

GACTTTGATG TCTGGAGGAA AAAGTATATG CGTTGGATGA ATCACAAAAA GTCTCGAGTG 1440 

ATGGATTTCT TCCGGCGCAT TGATAAGGAC CAGGATGGGA AGATAACACG TCAGGAGTTT 1500 

25 

ATCGATGGCA TTTTAGCATC CAAGTTCCCC ACCACCAAGT TAGAGATGAC TGCTGTGGCT 1560 

GACATTTTCG ACCGAGATGG GGATGGTTAC ATTGATTATT ATGAATTTGT GGCTGCTCTT 1620 

30 CATCCCAACA AGGATGCGTA TCGACCAACA ACCGATGCAG ATAAAATCGA AGATGAGGTT 1680 

ACAAGACAAG TGGCTCAGTG CAAATGTGCA AAAAGGTTTC AGGTGGAGCA GATCGGAGAG 1740 

AATAAATACC GGTTCTTCCT CGGCAATCAG TTTGGGGATT CTCAGCAGTT GCGGCTGGTC 1800 

35 

CGTATTCTGC GCAACCGTGA TGGTTCGCGT TGGTGGAGGA TGGATGGCCT TGGATGAATT 1860 

TTTAGTGAAA AATGATCCCT GCCGAGCACG AGGTAGAACT AACATTGAAC TTAGAGAGAA 1920 

40 ATTCATCCTA CCAGAGGGAG CATCCCAGGG AATGACCCCC TTCCGCTCAC GGGGTCGAAG 1980 

GTCCAAACCA TCTTCCCGGG CAGCTTCCCC TACTCGTTCC AGCTCCAGTG CTAGTCAGAG 2040 

TAACCACAGC TGTACATCCA TGCCATCTTC TCCAGCCACC CCAGCCAGTG GAACCAAGGT 2100 

45 

TATCCCATCA TCAGGTAGCA AGTTGAAACG ACCAACACCA ACTTTTCATT CTAGTCGGAC 2160 

ATCCCTTGCT GGTGATACCA GCAATNAGTT CTTCCCCGGC CTCCACAGGT GCCAAAACTA 2220 

50 ATCGGGCAGA CCCTAAAAAG TCTGCCAGTC GCCCTGGGAG TCGGGCTGGG AGTCGAGCCG 2280 

GGAGTCGAGC CAGCAGCCGG CGAGGAAGTG ACGCTTCTGA CTTTGACCTC TTAGAGACGC 2340 

ATTGCTTGTT CCGACACTTC AGAAAGCAGC GCTGCAGGGG GCCAAGGCAA CTCCAGGAGA 2400 

55 

GGGCTAAACA AACCTTCCAA AATCCCAACC ATGTCTAAGA AGACCACCAC TGCCTCCCCC 2460 

AGGACTCCAG GTCCCAAGCG ATAACACTGT CTAAGCACCC CCAAGCCACT ATCCACTTTG 2520 

AATCCTGCTC CATACATTGG GTGTATATTT ATTCTGAACG GGAGAAGTTA TATTGTTAAA 2580

AGTGTAAAAG AATAATTGTG TTATGAAGCT GCCTTATTTT TTTTCTTTTT GTAAGTTACT 2640

ATTTTCATGT GAATATTTAT GTAGATAAAA TTTGCCTCCT GGTAACCCTG TAATGGATGG 2700 

5 

GGCCCAGAAA TGAAATATTT GAGAAAAACA AGTGAAAAGG TCAAGATACA AATGTGTATT 2760 

AAAAAAAAAA AAGCCTATTA ATAGGGTTTC TGCGCGGTGC AGGGTTGTAA ACCTGCTTTA 2820 

10 TCTTTTAGGA TTATTCCTAA ATGCATCTTC TTTATAAACT TGACTTGCTA TCTCAGCAAG 2880 

ATAAATTATA TTAAAAAAAT AAGAATCCTG CAGTGTTTAA GGAACTCTTT TTTTGTAAAT 2940 

CACGGACACC TCAATTAGCA AGAACTGAGG GGAGGGCTTT TTCCATTGTT TAATGTTTTG 3000 

15 

TGATTTTTAG CTAAAGAGAG GGAACCTCAT CTAAGTAACA TTTGCACATG ATACAGCAAA 3060 

AGGAGTTCAT TGCAATACTG TCTTTGGATA TTGTTTCAGT ACTGGGTGTT TAAAGGACAA 3120 

20 ATAGCTQCTA GAATTCAGGG GTAAATGTAA GTGTTCAGAA AACGTCAGAA CATTTGGGGT 3180 

TTTAAACTGA TTTGTTGCTC CCTATCCAGC CTAGACACCA GTAACTCTTG TGTTCACCAG 3240 

GACCCAGACC CTTGGCAAGG GATAGGCTCG TTGGTGACAT TGTGAATTTC AGATTTGTTT 3300 

25 

TATCCACTTT TTTTGCTATT TATTTAAATG GTCGATCAAC TTCCCACAAA CTGAGGAATG 3360 

AATTCCACGA GCCTGTTCTG AAAATGTGGA CGTAAGACAA ACACGTGCTC GTCCTTTAAT 3420 

30 GGAGTTCACC AGCACACTTG TTAACCAGTC CTGTTTGCTT TCGTCTTTTT TTGTGCGTAA 3480 

TAAAGTCAAC TGACCAAGTG ACCATGAAAA GGGGCTGTCT GGGGCTCCTG TTTTTTAGCT 3540 

GCTGTTCTTC AGCTCCGACC ATGTTGCTGT GTGATTATCT CAATTGGTTT TAATTGAGGC 3600 

35 

AGAAACTGAA GCTCTACCAA TGAACTGTTT AGAAACAAGA CACACTTTTG TATTAAAATT 3660 

GCTTGCAGTA ACAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGAGG GGGGCCCGGT 3720 

40 ACCCAATTCG CCGTATATGA TCGTAAACAA TC 3752 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1144 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

TGACCCTCTG CCTGCCGGGC TCAGTGCTGG ACGCTTTCTG TTTTGTCGCA GTCGGTCCTC 60

GGTAACACCA GCGGCCTGTG GTCCACCACT CCATTCAGCA GCTCCATTTG GTCCAGCAAC 120

CTTAGCAGCG CCTTCCCTTC ACCACTCCAG CAAACACGCT GGCAAGCATC GGCCTCATGG 180

GCACAGAAAA CTCCCCTGCT CCTCACGCTC CCTCCACCTC CAGTCCAGCT GACGACTTGG 240



GACAGACCTA CAACCCGTGG CGGATATGGA GCCCCACGAT TGGAAGAAGA AGCTCGGACC 300 



5 CTTGGTCTAA TTCGCACTTT CCTCACGAGA ATTAAATTAA GCAAAAAACA AACAAACATA 360 



GTGGGCCCTC GTCTAGATCA TGATGTGCCA GTTTCTGAGA CATCTTTTTA AGGCTCTTAC 420 



TGCAGCTCCC CTCCCCACCC TCCTCTTCTT TGCAAAACAG ACCCAAGCAG GGCAGGCTCA 480 

10 

GACCACTCGC TTCTTTCAGA TCTTTCTTGC AATTATGATA ACATGAGATT TGCTGTTGTG 540 



CTTTTAGAGA AAAGTCTGGA CTCAGCCACA AACTCTAATA AGACCTGTAC ATCTGAGAAC 600 



15 CTTTCCCGTT ACTGCGTTTT CACCACCTGT CTTCCCCATG CTTTATTTAT CTGTATGAAC 660 



ACAGATTTGA CATTACAGCT AAGGAAATAA TTTGAGTTGA TTCAGAAATC CTGGCATGTG 720 



ACAATTTTGT TAAATTACCA AGTTTGGTTT TTAATAATTT CTCAATATTA TGCGCCAAGA 780 

20 

TCTAATTTTA AAACTGTATG AGGACTTTGT GCTGAAAATA GAGTATTTTT TTAAAGTAAG 840 



GCTGTCTTOG TTTAAAAGCA GATTACAGAA ATGTAAGTCA ACTTAAGAAC RGTGAATGAA 900 



25 TGTAAAAACA TTCAGTYGAG ACCATATGCA TTTTCTGTGC TGTTTGTACT TGAGGTATGT 960 



AACATTTGTA TACCTGAACT TATTTTAAAG ATGAACTGAA ATGCACATAG CCAAGTCTTG 1020 



AGATACAAGA TTGAATGTGT ATTTCTTAAA AATACAACTT TGTGTTGTAC TTTGAAATAA 1080 

30 

ATGATGCTTT TTTCAAAAAA AAAAAAAAAA AAAAAAAAAC TCGAGGGGGG GCCCGGTACC 1140 



CAAT 1144 



35 



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1830 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 
GCATGCAGAG GAGCACCCTG AGCGTGTYCC TGGAGCAGGC GGCCATSTTG GCACGGAGCC 60 



50 ACGGGTTGCT GCCCAAGTGC ATCATGCAGG CCACGGACAT CATGCGGAAC AGGGCCCAAG 120 



GGTGGAGATT CTGGCCAAAA ACCTGCGAGT CAAGGACCAG ATGCCCCAGG GTGCTCCGCG 180 



CCTCTACCGC CTCTGCCAGC CGCCGGTGGA TGGGGACCTC TGAACACCCA AATGCCCCAC 240 

55 

GCTGGGCCGC GGCCTCTGGA GCTGGGATTT GGGAGGACAC AGCAGGCAGC GCTGGCCTTC 300 



TCCAGGGATG GCCCAANGCT TCCGCARCCG CCCGTTCCGG GACCTGCCCA GCGTCCTCCC 360 



TGCCTCCTTC CGGGACAAGC CTGGCCACCC TCGCTGTGAT GACGAGCTGG CTGATTGGCC 420



CTGGGCCGGC CCATTCTTCA CACGCCTGCC AGAAGCTGGA GGGGTGCTGG AGACCCATAG 480 

AGCTGATGGG AGCAGCTGGT GCCTGGCCTT CGGCTCCTGC GTCCCCAGAA CCCAAGGGAA 540 

CGTCATGGAG GCCACATGGG GCCACCCGGC TCCCTCGGGA TGGCTCCGCT GCACTTTTGA 600 

AACCCCQGTT TCCTTCAACG TCCACATTCC AGGTGACCAC ACGTGTCTCC TCCTCCTCAT 660 

CTTAGCTTCC AGGTTCACCC TAACCCTGTA CTAACCTGCT TGGTGGACTT GGAAAAGACT 720 

TGGCTCTGTC GGGAAAGGAG AGACGGGGCC TCCATCACGC CTGTTACCAG AGGATCCCCG 780 

AGAGCCACAC CAGCTCTGGA CATCACCGCC CCTGGAACTG GGGCCACCAG CCCTGGGCAC 840 

GAGATTTGCT CTGACTTTAT TTATATGGCA TGAAATCTCT GGTTTATTTT GGGATTTTTT 900 

GTTGTTGGTG TTGTCAAAGT TTGTTTTTTC TAAAGTTGTG TGATTATATA TTTGACATTT 960 

20 TACATTTCAA AGAAAGGTAT GTTGTCTAAC AGGGGACCAA CAGAAGGTAG TATTGACAAC 1020 

TGTTCCTGCT TCTACTAAAA AAAAAAGAGC ACAAAAGAAA AACTAAATTA TTGAAAAATT 1080 

AAAAAATGTC ATTGTTTCCT GTTTGTTAAT ATTAGGGTTG TAAGGTGTCG TTTTGAGGTA 1140 

25 

TOGACTGTGA TTCCTTCCCC CACCCTCCAT TCTCCAGCGG TTGGCCGGTG TTAGAACTCG 1200 

CTCTCTTTGA GTGACTGGCT ACAAGGGCCT GAGAGGTGGC CAGCCAGGGT TGGAGCTGGA 1260 

30 GGGGATGGAG CCCCACCTGA GGTGCCGTGT CACACGGGTT AGAQGGTCAC TGGGAAACAC 1320 

CGGGCGGTGG CTTCTGTGAT TTATTTTCTT GATGGTAACT TCTCAGAGCA GGGCRATTGG 1380 

GACATCACCA GCCAGAGCAC AGGAAGCCAC CCTGCCTGCT GGGGAGGAGG GACCCACACA 1440 

35 

AGCCCCCTCG GCAGTTTGTC CCCCCAGCTT CGGTATGCCT TCAGGGAAAG GTCACAGCTG 1500 

GGGAGGAAGC GGGGGGACGC CTGTCACCCC TGGCAGGTGG TGAGTTCAGG TGGGGGCTCC 1560 

40 CTCCTKCCCC CAGGCCTGGG AGCTTGAAGC CCTCCCGGCA TCTGGCATCC GAGCCTCCCG 1620 

CCCTCCAGGG TGCGCTTCCC TCTCTTGCCG CAGCATACAC GAGGGCAGGC AGTGGCCTTG 1680 

TCACTGTATC TTGCATCAGA GACAAAGGAG GACCCGCTTT AGCCCTGCTG CGGGAAATGG 1740 

45 

GGGATGGCCC AGGGCCAGCG CATTGTGCAC TGGTTTACTT TAAAATGTAC AGATTCTTCT 1800 

CGTTAAATTC TTGATAGATT TTTTATTATT 1830 

50 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

GGCCGCCCGG ATGGCGACCC CAGCCTCGGC CCCAGACACA CGGGCTCTGG TGGCAGACTT 60

TGTAGGTTAT AAGCTGAGGC AGAAGGGTTA TGTCTGTGGA GCTGGCCCCG GGGAGGGCCC 120

AGCAGCTGAC CCGCTGCACC AAGCCATGCG GGCAGCKGGA GATGAGTTCG AGACCCGCTT 180

CCGGCGCACC TTCTCTGATC TGGCGGCTCA GCTGCATGTG ACCCCAGGCT CAGCCCAACA 240

ACGCTTCACC CAGGTCTCCG ATGAACTTTT TCAAGGGGGC CCCAACTGGG GCCGCCTTGT 300

AGCCTTCTTT GTCTTTGGGG CTGCACTGTG TGCTGAGAGT GTCAACAAGG AGATGGAACC 360

ACTGGTGGGA CAAGTGCAGG AGTGGATGGT GGCCTACCTG GAGACGCGGC TGGCTGACTG 420

GATCCACAGC AGTGGGGGCT GGTTATCCCA GATCACTGAA GCTGAGATGG CTGATGAAGT 480

AATTTGCAGT GAAATTTTAA GCGACTGTGA CTCTGCTGCA AGTTCCCCAG ATCTTGAGGA 540

GCTGGAAGCT ATCAAAGCTC GAGTCAGGGA GATGGAGGAA GAAGCTGAGA AGCTAAAGGA 600

GCTACAGAAC GAGGTAGAGA AGCAGATGAA TATGAGTCCA CCTCCAGGCA ATGCTGGCCC 660

GGTGATCATG TCCATTGAGG AGAAGATGGA GGCTGATGCC CGTTCCATCT ATGTTGGCAA 720

TGTGGACTAT GGTGCAACAG CAGAAGAGCT GGAAGCTCAC TTTCATGGCT GTGGTTCAGT 780

CAACCGTGTT ACCATACTGT GTGACAAATT TAGTGGCCAT CCCAAAGGGT TTGCGTATAT 840

AGAGTTCTCA GACAAAGAGT CAGTGAGGAC TTCCTTGGCC TTAGATGAGT CCCTATTTAG 900

AGGAAGGCAA ATCAAGGTGA TCCCAAAACG AACCAACAGA CCAGGCATCA GCACAACAGA 960

CCGGGGTTTT CCACGAGCCC GCTACCGCGC CCGGACCACC AACTACAACA GCTCCCGCTC 1020

TCGATTCTAC AGTGGTTTTA ACAGCAGGCC CCGGGGTCGC GTCTACAGGG GCCGGGCTAG 1080

AGCGACATCA TGGTATTCCC CTTACTAAAA AAAGTGTGTA TTAGGAGGAG AGAGAGGAAA 1140

AAAAGAGGAA AGAAGGAAAA AAAAAAGAAT TAAAAAAAAA AAAAAAAAAA ACAGAAGWTG 1200

MCCTTGATGG AAAAAAAATA TITTTTAAAA AAAAGATATA CTGTGGAAGG GGGGAGAATC 1260

CCATAACTAA CTGCTGAGGA GGGACCTGCT TTGGGGAGTA GGGGAAGGCC CAGGGARTGG 1320

GGCAGGGGGC TGCTTATTCA CTCTGGGGAT TCGCCATGGA CACGTCTCAA CTGCGCAACT 1380

GCTTGCCCAT GTTTCCCTGC CCCACCCCAC CCCTCTTCTC CGGCTCCCTG CCCCTCCAGA 1440

TTGCCTGGTG ATCTATTTTG TTTCCTTTTG TGTTTCTTTT TCTGTTTTGA GTGTCTTTCT 1500

TTGCAGGTTT CTGTAGCCGG AAGATCTCCG TTCCGCTCCC AGCGGCTCCA GTGTAAATTC 1560

CCCTTCCCCC TGGGGAAATG CACTACCTTG TTTTGGGGGG TTTAGGGGTG TTTTTGTTTT 1620

TCAGTTGTTT TGTTTTTTTG TTTTTTTNTT TTTCCTTTGC CTTTTTTCCC TTTTATTTGG 1680

AGGGAATGGG AGGAAGTGGG AACAGGGAGG TGGGAGGTGG ATTTTGTTTA TTTTTTTAGC 1740

TCATTTCCAG GGGTGGGAAT TTTTTTTTAA TATGTGTCAT GAATAAAGTT GTTTTTGAAA 1800

AKAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1860

AAAA 1864



10 {2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2041 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

GGCACGAGCG CGCGGCAGGG CCCTGGACCC GCGCGGCTCC CGGGGATGGT GAGCAAGGCG 60

CTGCTGCGCC TCGTGTCTGC CGTCAACCGC AGGAGGATGA AGCTGCTGCT GGGCATCGCC 120 

TTGCTGGCCT ACGTCGCCTC TGTTTGGGGC AACTTCGTTA ATATGAGGTC TATCCAGGAA 180 

AATGGTGAAC TAAAAATTGA AAGCAAGATT GAAGAGATGG TTGAACCACT AAGAGAGAAA 240 

ATCAGAGATT TAGAAAAAAG CTTTACCCAG AAATACCCAC CAGTAAAGTT TTTATCAGAA 300 

AAGGATCGGA AAAGAATTTT GATAACAGGA GGCGCAGGGT TCGTGGGCTC CCATCTAACT 360

GACAAACTCA TGATGGACGG CCACGAQGTG ACCGTGGTGG ACAATTTCTT CACGGGCAGG 420 

AAGAGAAACG TGGAGCACTG GATCGGACAT GAGAACTTCG AGTTGATTAA CCACGACGTG 480 

TGGAGCCCCT CTACATCGAG GTTGACCAGA TATACCATCT GGCATCTCCA GCCTCCCCTC 540 

CAAACTACAT GTATAATCCT ATCAAGACAT TAAAGACCAA TACGATTGGG ACATTAAACA 600 

TGTTGGGGCT GGCAAAACGA GTCGGTGCCC GTCTGCTCCT GGCCTCCACA TCGGAGGTGT 660

ATGGAGATCC TGAAGTCCAC CCTCAAAGTG AGGATTACTG GGGCCACGTG AATCCAATAG 720 

GACCTCGGGC CTGCTACGAT GAAGGCAAAC GTGTTGCAGA GACCATGTGC TATGCCTACA 780 

TGAAGCAGGA AGGCGTGGAA GTGCGAGTGG CCAGAATCTT CAACACCTTT GGGCCACGCA 840 

TGCACATGAA CGATGGGCGA GTAGTCAGCA ACTTCATCCT GCAGGCGCTC CAGGGGGAGC 900 

CACTCACGGT ATACGGATCC GGGTCTCAGA CAAGGGCGTT CCAGTACGTC AGCGATCTAG 960

TGAATGGCCT CGTGGCTCTC ATGAACAGCA ACGTCAGCAG CCCGGTCAAC CTGGGGAACC 1020

CAGAAGAACA CACAATCCTA GAATTTGCTC AGTTAATTAA AAACCTTGTT GGTAGCGGAA 1080 

GTGAAATTCA GTTTCTCTCC GAAGCCCAGG ATGACCCACA GAAAAGAAAA CCAGACATCA 1140 

AAAAAGCAAA GCTGATGCTG GGGTGGGAGC CCGTGGTCCC GCTGGAGGAA GGTTTAAACA 1200 

AAGCAATTCA CTACTTCCGT AAAGAACTCG AGTACCAGGC AAATAATCAG TACATCCCCA 1260

AACCAAAGCC TGCCAGAATA AAGAAAGGAC GGACTCGCCA CAGCTGAACT CCTCACTTTT 1320

AGGACACAAG ACTACCATTG TACACTTGAT GGGATGTATT TTTGGCTTTT TTTTGTTGTC 1380 

GTTTAAAGAA AGACTTTAAC AGGTGTCATG AAGAACAAAC TGGAATTTCA TTCTGAAGCT 1440 

TGCTTTAATG AAATGGATGT GCCTAAAAGC TCCCCTCAAA AAACTGCAGA TTTTGCCTTG 1500 

CACTTTTTGA ATCTCTCTTT TTATGTAAAA TAGCGTAGAT GCATCTCTGC GTATTTTCAA 1560

GTTTTTTTAT CTTGCTGTGA GAGCATATGT TGTGACTGTC GTTGACAGTT TTATTTACTG 1620 

GTTTCTTTGT GAAGCTGAAA AGGAACATTA AGCGGGACAA AAAATGCCGA TTTTATTTAT 1680 

AAAAGTGGGT ACTTAATAAA TGAGTCGTTA TACTATGCAT AAAGAAAAAT CCTAGCAGTA 1740 

TTGTCAGGTG GTGGTGCGCC GGCATTGATT TTAGGGCAGA TAAAAGAATT CTGT3TGAGA 1800 

GCTTTATGTT TCTCTTTTAA TTCAGAGTTT TTCCAAGGTC TACTTTTGAG TTGCAAACTT 1860

GACTTTGAAA TATTCCTGTT GGTCATGATC AAGGATATTT GAAATCACTA CTGTGTTTTG 1920 

CTGCGTATCT GGGGCGGGGG CAGGTTGGGG GGCACAAAGT TAACATATTC TTGGTTAACC 1980 

ATGGTTAAAT ATGCTATTTT AATAAAATAT TGAAACTCAC CAAAAAAAAA AAAAAAAAAA 2040 

A 2041 

(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2012 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
TACCAAGCTG CAAGAATCTA CTATATCATG GCAGAAGAAG TAGAGTGGGA CTATTGCCCT 60
GACCGGAGCT GGGAACGGGA ATGGCACAAC CAGTCTGAGA AGGACAGTTA TGGTTACATT 120
TTCCTGAGCA AGAAGGATGG GCTCCTGGGT TCCAGATACA AGAAAGCTGT ATTCAGGGAA 180 
TACACTGATG GTACATTCAG GNTCCCTCGG CCAAGGACTG GACCAGAAGA ACACTTGGGA 240 

ATCTTGGGTC CACTTATCAA AGGTGAAGTT GGTGATATCC TGACTGTGGT ATTCAAGAAT 300 
AATGCCAGCC GCCCCTACTC TGTGCATGCT CATGGAGTGC TAGAATCTAC TACTGTCTGG 360 
CCACTGGCTG CTGAGCCTGG TGAGGTGGTC ACTTATCAGT GGAACATCCC AGAGAGGTCT 420
GGCCCTGGGC CAATGACTCT GCTTGTGTTT CCTGGATCTA TTATTCTGCA GTGGATCCCA 480 
TCAAGGACAT GTATAGTGGC CTGGTGGGGC CCTTGGCTAT CTGCCAAAAG GGCATCCTGG 540 

NAGCCCCATG GAGGACGGAN TGACATGGAT CGOGAATTTG CATTGTTGTT CTTGATTTTT 600 

GATGAAAATA AGTCTTGGTA TTTGGAGGAA AATGTGGCAA CCCATGGGTC CCAGGATCCA 660 

GGCAGTATTA ACCTACAGGA TGAAACTTTC TTGGAGAGCA ATAAAATGCA TGCAATCAAT 720 

GGGAAACTCT ATGCCAACCT TAGGGGTCTT ACCATGTACC AAGGAGAACG AGTGGCCTGG 780 

TACATGCTGG CCATGGGCCA AGATGTGGAT CTACACACCA TCCACTTTCA TGCAGAGAGC 840 
TTCCTCTATC GGAATGGCGA GAACTACCGG GCAGATGTGG TGGATCTGTT CCCAGGGACT 900
TTTGAGGTTG TGGAGATGGT GGCCAGCAAC CCTGGGACAT GGCTGATGCA CTGCCATGTG 960

ACTGACCATG TCCATGCTGG CATGGAGACC CTCTTCACTG TTTTTTCTCG AACAGAACAC 1020

TTAAGCCCTC TCACCGTCAT CACCAAAGAG ACTGAAAAAG CAGTGCCCCC CAGAGACATT 1080
GAAGAAGGCA ATGTGAAGAT GCTGGGCATG CAGATCCCCA TAAAGAATGT TGAGATGCTG 1140

GCCTCTGTTT TGGTTGCCAT TAGTGTCACC CTTCTGCTCG TTGTTCTGGC TCTTGGTGGA 1200 

GTGGrTTGGT ACCAACATCG ACAGAGAAAG CTACGACGCA ATAGGAGGTC CATCCTGGAT 1260 

GACAGCTTCA AGCTTCTGTC TTTCAAACAG TAACATCTGG AGCCTGGAGA TATCCTCAGG 1320

AAGCACATCT GTAGTGCACT CCCAGCAGGC CATGGACTAG TCACTAACCC CACACTCAAA 1380 

GGGGGATGGG TGGTGGAGAA GCAGAAGGAG CAATCAAGCT TATCTGGATA TTTCTTTCTT 1440 

TATTTATTTT ACATGGAAAT AATATGATTT CACTTTTTCT TTAGTTTCTT TGCTCTACGT 1500 

GGGCACCTGG CACTAAGGGA GTACCTTATT ATCCTACATC GCAAATTTCA ACAGCTACAT 1560 

TATATTTCCT TCTGACACTT GGAAGGTATT GAAATTTCTA GAAATGTATC CTTCTCACAA 1620

AGTAGAGACC AAGAGAAAAA CTCATTGATT GGGTTTCTAC TTCTTTCAAG GACTCAGGAA 1680 

ATTTCACTTT GAACTGAGGC CAAGTGAGCT GTTAAGATAA CCCACACTTA AACTAAAGGC 1740 

TAAGAATATA GGCTTGATQG GAAATTGAAG GTAGGCTGAG TATTGGGAAT CCAAATTGAA 1800 

TTTTGATTCT CCTTGGCAGT GAACTACTTT GAAGAAGTGG TCAATGGGTT GTTGCTGCCA 1860 

TGAGCATGTA CAACCTCTGG AGCTAGAAGC TCCTCAGGAA AGCCAGTTCT CCAAGTTCTT 1920

AACCTGTGGC ACTGAAAGGA ATGTTGAGTT ACCTCTTCAT GTTTTAGACA GCAAACCCTA 1980 
TCCATTAAAG TACTTGTTAG AACACTGAAA AA 2012



(2) INFORMATION FOR SEQ ID NO: 133:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1669 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

GAGCAGTATT TTAACCAACT TGTATTACAG ATGTTACAGT TCATGTTAGG AAGTCAGAAA 60 

AGACTTTGTT TGTCTTTGTT CTGCTGATGT GAGTCATGTT TTGTGGGGTC TTCCATGGCA 120 

CATTTACCTG TTGCTCCGTC CAGATGTTGA GGGCCAGTCT AGGCTGACAC ATCCTACCCG 180 

AGGACAAGCC TGTTCTCCAT TTCTTCACTC TCCCCTCCCC ATATAGCAAC TCTCCCAGGT 240

TTAGATTACC GTTTTCGACG ACAGATTAAC CAAAAATGCC CCACACAGGT TTTATTACTG 300 

TTATATACTA TACTTTTAAC AGTACAGACC CTAAATTTTA TTATTTGTTG CTCCCCCAAT 360 

CTGATACCAA ATGTTTAAAG TTGTTTGAAA TCCAAACATG GTAGTGTTCA TGGGTAAATA 420 

TTTTCTAGGC TATGTAAGAG TTAGCAGCCC ATAGCATAGA AGTAATCAAG TAGCATCTGA 480 

GACTGTTGGA GGCACTAGGG CCTCTCTGGG CCTAACAGCC TCACTTCCCC AGCCTCACCT 540

TGCTGTCCTC TGACACTGCC ATCAGGGCTG TTAGTGGCAC CTGTATGAGG CCAAGTGTGC 600 

GTCCAGGGGA ACAGCACAGG TTAATGCGTC TCCCTAGAAC TCATGAAGTC AGTTTAATTC 660 

ATGCATGAAC ATGAGTTCAT TTTATGTTTT ATATAGCTTT CTTAGACATA CCAAACCATC 720 

ATTCATAAAT CAGATAAATT ATTCAGTITT TGTGTTTAGA AAGCTAAGTA TGTGTAGCTG 780 

GAAACAAAAA TGAGCGTGTT TTCTCTCCTG TTAATCTAGA GTGTGCAGTT ACACATGTGT 840

GGATAATTTC ATGTTCCAGG GGCGCTTGGC ATCTCCCATG GACTGATTCC CAGGAAGAAA 900 

AGCCCAAAGG GAAACCCACG ATTCCTTTCG AGTAGATGTG GGAAAGAGCC CATTGGAGGA 960 

TATGAGGTCC TGTGAAATTC AGTTGTGTGT GTGGCTCCTT GTTAGCAGTC ATGTTGACAT 1020 

GGTGTTAGGA GGCTCCCCAT CCACCCTTTA CATGATGTAG GGACCAGTGT CTTGTGAGAT 1080 

TAACCTTGGG ACACAGTGGG TTAGCCTGGA GAAAATGAGA GGCCCTGCCT GGACCCAGGG 1140

AGAGGAGCCA GTGACACAGG CAGAGCGGTG CAGCCCTCCT TCCCTTCCAT TTGGAGGAGG 1200 

TGGTGCCAGG AGCCTGCCCG CTTACCTCTG CTGAAGCATA AGTGGACTTT GCTTTTGGGG 1260 

CTTATCTCTG ATACATGCTG GAGCCCTGCC TCTCCACTGC TAGATGGAAC CTGGAATCTC 1320 

TCATCTACCT CTTAGTCTGT CAGTTTCTAC GTGTGAGAAG CAAGCTTGTG GGCCAGTGTC 1380 

CTTGTACATG CTGTAGCACT TAAAAAATAA TTCCAGGGTT CCCTGGAAAA CCAGTCCCAG 1440

GGTTCCTATG ATCTGTAGTT TCTACCTGGA TTATAACTGG TTTTGGGTAC CTGAATTTTG 1500 

ATTGGTTAGC CTTAATTATA GTCTGGCGTG ATCATGTAGA ATCTTTTCTG GTGAACAGAT 1560 

CATAAAGTTC TATCAAGGAG TTCTATCAAG GCATCCATGT CAGTGGTGCT ATGCTGGTTA 1620 

CAACTTGAGA TTTTTGAAAT AAAAAATTTG TCATAAAAAA AAAAAAAAA 1669 



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1565 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
CACTTTTGCT ATATAACCTA AGTGATAACC CTCTTTTAGT TACCTGCCAA ACTCTGGNCT 60
TGGTTTATAT TGCAGTTAAC ACAGTTACAA AGCTGTAATG GTGTCTTTTT TTCCTTTGTA 120
ACGGAATGTG TAAATCAAAG TATATACATT GTGTGGTGTT CCTGTTTCTG GAGTTTCATG 180 
AGGATTTACA CATGGCATTC AGTGTTCTGT ATAGATCTGC CTACCTTTGT GAATTCATCT 240 

GTTAACCCCT CTTCCTTTGA GAGAGCACCG GCGATGGTGG TTAACTCCTT GTGTTTTCTC 300 
TCTCTCCTAC TGGTTATTCT TGAATTAAGC ACAGACTCGT CAGCTCGGTT GCTTTATCAT 360 
GAATAATGTG TGTGACCTTG CAGTTCTTCC ACAGTTCAGC AAACAAGTGC TAGCTTCACT 420
GACCAAAAAT TAAGGAAGGA AAACACAGTT TTTAAAACGA TCCATCTTTT AACAGCCGAA 480 
ACCGATGTGT CTATGGTGCT GCACCTTGCT GTTGTACTTC TGAAATCAGA CGTGTGTGAA 540 

CGATCATTTC TGACTTAACC GTGAGATGCT CACGAGTACC CTTCCTGTTG TTTTGTTAGC 600 
ATTGAAATCG AGACTATTTA TTTGGAATAT ATACAACAGT GTTTTTCCAC TGTATTTCAT 660 
TTGCAAAAGT TGAGAACTGC TTTCTCTACC TTTTGCAAAA TAATTGATAT TCCATATTGG 720
ATTCTCAAAG ACTTCGATAT GGTGAACCTA TTAAACCTAG AAATTGTATT CATCCTTTCA 780 



TGACTGTGGC CTGAGTTCCC CAGCCCCTCT CCTCCTTTTT TTTAGATGAG ATTTAGCACA 840 

CTCTCAGTTA TTTAAACATG CAACATTTCT TGAGTATGTA TGTTGAGGCC ATCTGAGCTC 900 

ATAGCTGATT CAGTAACCAG TTTCATGCTG TGTCATTCAC ACTCACTACT TAATACTGCC 960 

ATGGTGAAAA TGTGGAGGAA AAATGTATCC ATGTGTGTCT GGGAAGCATA TACACTTGTA 1020

CATTTTTTAA TACTCTGATT CTGTAACATT TCTGAGTTTT GTTTTGTTTT ACAGNAAAAA 1080 



AAAAAAAAGT GATAAAGCAA TCAGAAGACC AAGAGGTTTA CTATTGATGC TTAGGGTCGT 1140 

CTGACCTTGG CTGGCCAATA GACCTACACG GCCAAATTAA TTTACGAGAG TAATAATTTT 1200 

TCAAAAGCCA ATTTTTTTTC TGTATTTTCT GTATGAAACT GCCAATATCA TGAATAGAAA 1260 

GGGAGAACCA TAAAGGAGAA AGAACGTGAT GTTCTGTTAT GTTCATGTAA ACCTAAAGAA 1320

ACAGTGTGGA GGCAGGCGCG ATCAGCCGAA CTCTAGGGAC TTGGTGTTGC TTGGAAGGCA 1380 

TCCATACCTG CATTTTGCAT TCTTCGTATG TAATCATATT GCCAAAGACA AACTATTTCA 1440 

TCATTTATTG TAAATAACAC TTTTCCCCAG ACCTACCATA AAGTTTCTGT GATGTATTGT 1500
CTTCCAGTTG CAATAAAAAT TACTGAGTTG CATCAATTGA AGAAAAAAAA AAAAAAAAAA 1560
CTCGA 1565



(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

TCTAAAAGCC CCCTTATACC CCACTTTGTG CAGCAAAGAT CCCCGTGCAG GTCACAGCCT 60 

GATTTGTGGC CAGGCTGGAC AAATTCCTGA GGCACAACTT GGCTTCAGTT CAGATTTCAA 120 

GCTGTGTTGG TGTTGGGACC AGCAGAAGGC AAACGTCCAG CCAACACACA GGACTGTAAG 180 

AGGACTCTGA GCTACGTGCC CTGTGAAGAC CCCCAGGCTT TGTCATAGGA GGTCGTTCAG 240 

CTTCCCCAAA GTCAGAGGTG ATTTGATTTG GGGAAGACTG AATATTCACA CCTAAGTCGT 300 

GAGCATATCC TGAGTTTTAC TTCCTTATGG CTTGCCCTCC AAGTTCTCTC TCTCATACAC 360 

ACACACACCC TTGCTCCAGA ATCACCAGAC ACCTCCATGG CTCCAGCTAT GGGAACAGCT 420 

GCATTGGGGC TGCCTTTCTG TTTGGCTTAG GAACTTCTGT GCTTCTTGTG GCTCCACTCG 480 

CGAGGCAGCT CQGAGGTGTG GACTCCGATT GGGCTGCAGG CAGCTCTCGG ACGGCACAGG 540 

GCGGGCGCTC TGATCAGCTC GTGTAAAACA CACCGTCTTC TTGGCCTCCT GGCAGTTCTT 600 

TCTGCGAATA GTCCTCTCCC TGGCCAGTTG AATGGGGGAA GCTGCTGGCA CAGGAAGGAG 660 

AGGCGATCCC GGCTGAGGCT TAGGAAATTG CTGGAGCCGG CTCCAAGCAG ATAATTCACT 720 

GGGGAGGTTT TCAGAGTCAA ACATCATTCT GCCTGTKTTG GGGQCCAGGT GTGTCACACA 780 

AGCATCTCAA AGTCAAAAGC CATCTGGGGC TGCTGCTTCT CTTTCTCAGG CTCTGGGGAA 840 

AGGAATCTCC CTCTCCTCTC ACTTGATTCC AAGTGTGGTT GAATTGTCTG GAGCACTGGG 900 

ACTTTTTTTC TCTTTTCCTT GATGGACCAA CAGTGCAAAT GCAATCTCGC CATTTAACTT 960 

TCAGGTCGAT TTCCTTTCCT GATCAGACAT CTTTGTGCCC CCTTTAGGAA GGAAAAGAAT 1020 

ACACCTACGA TGTGCCAGGC ACTGTGTTAG GCGCTTTTAT ATAGATCCTC GTTAGGATGA 1080 

GACTAAGGGA TGAGGACATC TCTTTATAAA AGGCCCCTAA GTAATGGATA AACAGAAACA 1140 

CTTAGAGGTG AGAAGGTCTG TCTTCAAGAT CCAAGGTAAG ATTGCCTTCA GTCTGATGTT 1200 

TGTTCTCAAG GACTTATCCC CTACAATATT CTCCCACTCC ATACTTCTCC TTCTACCCCA 1260 



WO 98/39448 



360 



PCT/US98/04493 



CCATCTGCTC CCGTGCACTC CTCAGATGGT CAGAGGGGTA ACCCAAGTCC TTAGAGAATT 1320 

TGGGGACCAA TAGAATATGT GATGTGTGAA TTTTCTTTAA AAAACTTAAG GAGTCTTTGC 1380 

TACCTTCTGC TTGTTGAGTT GTTTTGGCAT TCATATTAAA AGCCAGCATC TCACTATTTA 1440 

TTGACAGGTT GGGCTGTGTG TGTGCGCATG TGTGTATACA TTTCCAGGCG TGCCTGTGTC 1500

CTGTAGCTTT TTAAAAGGAA ACCCAGTCAT CCCACTATGA ATCTGGCATC TTCTTATGCT 1560 

TCTAGTGTTT TGGCCATACA TCAACCAAGG GGTTTAATTT ATCCAATGOT TGACGACATG 1620 

TTCAGGAGGG GCTGGATCAA ATTTTGAGAG GGTTATGGGA AAGGGAGGGG GAGAAGAAAT 1680 

TGACATTTAT TTTATTATTT ATTTTAAATG TTTACATCTT CTTTATGTTG TATCAAGCCT 1740 

GAATAGAAAC TGATAGCATT AAAATACTCC GTTCCTCTCT CTCTTCTCGC TTCCTTTTTT 1800 

TTTTTTTTTA AATTTAGGAT AACACATTTT TGTTTCTAAA GTGATTTGTG ATTTGTGCTG 1860

TATAAACTGT ATAAAAGGTT CTGTTTTTAA AGGTGGATTT TCATTCCTCT GGGGACAGTG 1920 

GTGGCCAAGA CATCTACATT GTAAGAGAAC ACAGTGGAAG ATCCTGTCCT GATTCTCAAA 1980 

AATTATTTTC TCTGTATGAT TAAAAGT 2007 

(2) INFORMATION FOR SEQ ID NO: 136: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1291 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

CTTTTAACCC TCCCCCTTCA CACACATACA TATCAGGTTC TTTTCTAGTT AAAAACCCAA 60 

GTAGCTCAGA TTCTACTTTA ATGTCAGTGC AGATTTGCAT TGAATCATGC CATTATGTTT 120 

TTTCTCATTT TTATGCTGTT GGGTCTTAGT TTTTAAATTG ATATAAAGAA CTCAGCAATG 180

GTTTTATTTT CTACTCATAC TTAGGGTTTA GGAAACACTA CCACTAGTTA TCATTTAATC 240

AACTTCAATG GTCTACTGAA ACAAAAATGG TAACTTTTCA TTAGTGGATT ATTTAGAGTT 300

ATAGTAGTTG TTTCCAGAAA ACACTTCCTC ACAATTGTAC TTCCCAATCA AATCATGTGA 360

TCATACAGTT ATTCCCATGA AAGGCAGAAT GTTTGTTTCA AAATTAATCT AGTTTTCTGT 420

ACATTTAAAT TTGAGAAGGT GACAACTGGC TCTTTTCCAG TCTTCCTTCA TGTCAGTTTT 480

CTGATAGACC ACTATTGGCA AACAGTATCT GTCAACTACC AAATGTGTAA AATTTTCTGT 540

ATTTCACTTT GTCTTATTTG TAAATAGTGA ACTAAAACTT TTGGCAGATC AGCAACATTT 600 

GCTGAGCCTG TTTTTTAAGC TAATGTGTAT TCTTACTAAT GTTCCTATCA AGAATGGATT 660 

TGTAATATAT GCTGTCTATT TCTAATGTTC ACATTCATAT TTTGAGGTTC TATCTTATTT 720 

TAATAGAGAA CAGACTTCTC AAAAAATCTT CAGAAGCAGC TTATTATTGA AATATCGAAA 780 

TATTGAAATA AACCCGGTGG GTTAGATTAC TCATCTGTCC ACCAAGTGGG ACATTTGCAT 840 

GGACTGGGGG CTTAAAGGAC TTAGAAGAGA CCTGTAAGTA AATCCTGAAA ATGAGCCAAT 900 

CCCCACTTGA ATGGTTACTG GAGTAAACCC ACCTTTACCA CCCCAATTAC AGCACCCGAG 960 

GCCGATAAAC CAACTTGGCT CTGGTTCATT TTTCTTTTCT TCATTTGTGA TGCTCAGATT 1020

CAAAATGTGT GTTCTACACT GTTACAGGCT TCTCTTTTGT TTGATTAAAG ATTTTAGTCC 1080

TACTTTTGTA TGGACACATT AGAATATTCA GAGACCAAAA TAGAAGAATT TGCTGTTAGA 1140 

TATTTTTCAG AAGTCAGCAG ATTTGTGGCA AATCATTTAT TTGCCTTTTT AAAAATTCAT 1200 

TTAAGCAGTT CAGAGAGTAG ACTACTCAGA AAATTATTTC ACGTAATTGT CTAAGAGGTC 1260 

AATATTTTTT AATGCATATT GAATCAAATA A 1291 

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1906 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

GGCACGAGGA CCTACTTTTG TAACAGACCA TGGTTGTGTC CAAGGTAAAA CCACAGTGAT 60

ATTTTTGGAT GCTTTGTCTG CAATCTTGAC TTGTTTTTGC AGTATCATTA TTCAGACTTC 120

AAATTGTGAA TCTTTTAAAC ATCTTGATAA TTTGTTGTTG AGAGCTGTTC ATTCTAAAAT 180 

GTAATGAAAT TCAGTCTAGT TCTGCTGATA AAGATCATCA GTTTTGAAAG GTTACTGATT 240 

rrcCTCTTCC CTCTTAGTTT TTTACCCAAT ATATGGAGAA GAGTAATGGT CAATCTTAAC 300 

ATTTTGTTTT AATTGTTTAA TAAAGCTGCT GGGCAGTGGT GCAGCATTCC TACCTAGTGT 360 

CATAAAAGCA AAATACTTAC ATAGCTTTCT TAAAATATAG GAATGACATT ACATTTTTAG 420

GAGAAAGTAA GTTGCTTTGC ACCGCCTACT TAATTCCTTT CCATATATTG TGATACAAAC 480 

TTTTGAATAT GGAATCTTAC TATTTGAATA GAAATGTGTA TGTATAATAT ACATACATAC 540 

ATAAGCATAT ATGTGTGTGT GTGTGTGTAT ATATATATAT ATGCATGCTG TGAAACTTGA 600 

CTACACAACA TAAATCACTT TTTAAATTCC AGGAACGGGT AGTCTGACAC GGTGATTATC 660

CTTTTGAGGC TGAATCCGTT ATTAACTTGT TATTTAGGTT TTACTCCCAG TAGCAAGGGA 720

TTCTAACTTA GTTGCACTTA CATGATTATT GTGATTTAAA ACTAAGAATA AAGGCTGCAT 780

TTTCAAAGAT AAATTGGAAT TGCTGTTGGT GAAATAACAA CCAAAATACT GAATCTGATG 840 

TACATACAGG TTTCTACAGG AAGAGATGGT ATAATTTACA ATTTGGAGAT TTAATAACCA 900 

GGGCTACCCA GAAAAAGTGA CTTGATAACA TGGTACCAAT AAGTAAGGGA TGCTCTCTCG 960 

GTTTGCTTTT GCCACTTTCA AGATTTTAAC TTCTCAGGTT ATTAATGAAA ATTATTGTAT 1020

AAGTTAGCCA ATAGAATTTT TAGGTTAAAA CAACAGATGG GGGGTTTGTG GAGTGTTTAA 1080 

TGTCATGGGC ATTTTTAGTA GCATAGACCC TTTGTTCTGC ATTTGAATGT TTCGTATATT 1140 

TTTGTTTCAC AGTTAATCTT CCCTCCCCAA GTTTGCTATT CAAATCAACT GCCTGAATGA 1200 

CATTTCTAGT AGTCTGATGT ATTTTTCTGA GGAATAGTTT GTGATTCCAA TGCAGGTGTC 1260 

TTCATTACCA TTACCTCTAC ACTGCAGAAG AAGCAAAACT CCTTTATTAG AATTACTGCA 1320

CATGTGTATG GGGAAAATAG TTCTGAAAGG CTAGAATGAT ACAAGTGAGC AAAAGTTGGT 1380 

CAGCTTGGCT ATGGAGTGGT GGCAATAATC TCTAAACATT CCAAAAGACC ATGAGCTGAA 1440 

CCTAAACTCC CTTGGGAATC TGGAACAAAG GAATATGAAA ATTGCCATTT GAAAACTGAC 1500 

CAGCTAATCT GGACCTCAGA GATAGATCAG CCAGTGGCCC AAAGCCATTT CAAGTACAGA 1560 

AATTATAGAG ACTACAGCTA AATAAATTTG AACATTAAAT ATAATTTTAC CACTTTTTGT 1620

CTTTATAAGC ATATTTGTAA ACTCAGAACT GAGCAGAAGT GACTTTACTT TCTCAAGTTT 1680 

GATACTGAGT TGACTCTTCC CTTATCCCTC ACCCTTCCCC TTCCCTTTCC TAAGGCAATA 1740 

GTGCACAACT TAGGTTATTT TTGCTTCCGA ATTTGAATGA AAAACTTAAT GCCATGGATT 1800 

TTTTTCTTTT GCAAGACACC TGTTTATCAT CTTGTTTAAA TGTAAATGTC CCCTTATGCT 1860 

TTTGAAATAA ATTTCCTTTT GTAAAAAAAA AAAAAAAAAA AAAAAA 1906



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1935 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
TCTGAACTAA TGCTAACAGA TCCCCCTGAG GGATTCTTGA TGGGCTGAGC AGCTGGCTGG 60
AGCTAGTACT GACTGACATT CATTGTGATG AGGGCAGCTT TCTGGTACAG GATTCTAAGC 120
TCTATGTTTT ATATACATTT TCATCTGTAC TTGCACCTCA CTTTACACAA GAGGAAACTA 180
TGCAAAGTTA GCTGGATCGC TCAAGGTCAC TTAGGTAAGT TGGCAAGTCC ATGCTTCCCA 240
CTCAGCTCCT CAGGTCAGCA AGTCTACTTC TCTGCCTATT TTGTATACTC TCTTTAATAT 300
GTGCCTAGCT TTGGAAAGTC TAGAATGGGT CCCTGGTGCY TTTTTACTTT GAAGAAATCA 360
GTTTCTGCCT CTTTTTGGAA AAGAAAACAA AGTGCAATTG TTTTTTACTG GAAAGTTACC 420
CAATAGCATG AGGTGAACAG GACGTAGTTN AGGCCTTCCT GTAAACAGAA AATCATATCA 480
AAACACTATC TTCCCATCTG TTTCTCAATG CCTGCTACTT CTTGTAGATA TTTCATTTCA 540
GGAGAGCAGC AGTTAAACCC GTGGATTTTG TAGTTAGGAA CCTGGGKTCA AACCCTCTTC 600
CACTAATTGG CTATGTCTCT GGACAAGTTT TTTTTTTTTT TTTTTTTTAA ACCCTTTCTG 660
AACTTTCACT TTCTATGTCT ACCTCAAAGA ATTGTTGTGA GGCTTGAGAT AATGCATTTG 720
TAAAGGGTCT GCCAGATAGG AAGATGCTAG TTATGGATTT ACAAGGTTGT TAAGGCTGTA 780
AGAGTCTAAA ACCTACAGTG AATCACAATG CATTTACCCC CACTGACTTG GACATAAGTG 840
AAAACTAGCC AGAAGTCTCT TTTTCAAATT ACTTACAGGT TATTCAATAT AAAATTTTTG 900
TAATGGATAA TCTTATTTAT CTAAACTAAA GCTTCCTGTT TATACACACT CCTGTTATTC 960
TGGGATAAGA TAAATGACCA CAGTACCTTA ATTTCTAGGT GGGTGCCTGT GATGGTTCAT 1020
TGTAGGTAAG GACATTTTCT YTTTTTCAGC AGCTGTGTAG GTCCAGAGCC TCTGGGAGAG 1080
GAGGGGGGTA GCATGCACCC AGCAGGGGAC TGAACTGGGA AACTCAAGGT TCTTTTTACT 1140
GTGGGGTAGT GAGCTGCCTT TCTGTGATOG GTTTCCCTAG GGATGTTGCT GTTCCCCTCC 1200
TTGCTATTCG CAGCTACATA CAACGTGGCC AACCCCAGTA GGCTGATCCT ATATATGATC 1260
AGTGCTGGTG CTGACTCTCA ATAGCCCCAC CCAAGCTGGC TATAGGTTTA CAGATACATT 1320
AATTAGGCAA CCTAAAATAT TGATGCTGGT GTTGGTGTGA CATAATGCTA TGGCCAGAAC 1380
TGAAACTTAG AGTTATAATT CATGTATTAG GGTTCTCCAG AGGGACAGAA TTAGTAGGAT 1440
ATATGTATAT ATGAAAGGGA GGTTATTAGG GAGAACTGGC TCCCACAGTT AGAAGGCGAA 1500
GTCGCACAAT AGGCCGTCTG CAAGCTGGGT TAGAGAGAAG CCAGTAGTGG CTCAGCCTGA 1560
GTTCAAAAAC CTCAAAACTG GGGAAGCTGA CAGTGCAGCC AGCCTTCAGT CTGTGGCCAA 1620
AGGCCAAGAG CCCCTGGCAA CCAACCCACT GGTGCAAGTC CTAGATTCCA AAGGCTGAAG 1680
AACCTGGAGT CTGATGTCCA AGAGCAGGAA GAGTGGAAGA AAGCCAGAAG ACTCAGCAAA 1740
CAAGGTAGAC AGTGTCTACC ACCAYAGTGG CCATACCAAA GAGGCTACCG ATTCCTTCCT 1800
GCTACCTGGA TCCCTGAAGT TGCCCTGGTC TCTGCACCTT CTAAACCTAG TTCTTAAGAG 1860
CTTTCCATTA CATGAGCTGT CTCAAAGCCC TCCAATWAAT TCTCAGTGTA AGYTTCAAAA 1920
AAAAAAAAAA AAAAA 1935



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

NGCCCCCTTG GCACAAGTCA GATGAAGCAC GTTCTGCCGG GGAGGCCCTC AMCTTCCAGA 60 

GAGGACAGAC ACAGATTTCC TGCTGGGGGA GGGAGGAGTC CACGCATCCT GATGCTGCCT 120 

GGAAGCTTAT TTTCCCGTGG CCAGGATGCA TTTCTCTGAG TGGAAACAGG TTCTTGCATG 180 

TGGATGTGTG TTTCCCCAGG CAGACGGCCC CTCTYTTCCC AGCACTTCCC TGCCTCCCCC 240

AGGCCTCAGG CCAGCACCCA GTTCCTCCTC ACATGGCAGG TGAGCACAGA CTTCTAGTTG 300 

GCAGGAGCTG AGGAGGGTGA ACAAACCCCG AGGGAGGCCC GGCCCTTGCT CCCGAGTTGG 360 

GGGGAGQGGG TGTGGCAACG TGCCCCCCGC AGAGGCCACG CATGTTTGAC CAAAGCCCTC 420 

ATTGTGGTCC GAGGACAGCC TTTTCCCCAG GCCTCARAGC ATTGCTCATC CGTGCCAAAC 480 

TGGGTAGGTG GATTTGAGCG GAAAGACTCC CAAAATGTGC CAAGAATTTC CCRGTCCCAG 540

GCAGGGCAGG GGAAACTAAG GGCAAGCAGG ATACAGGGCG AGGGATGTGG CAGGTGAGGG 600 

GGCTCCCGCC TGTGCCCCTT CTCCTCACCA TGTCTCCCCC ACCCTGCCTC AGTTCTCCGT 660 

TCCCCTTCAT CTCCGTCCCC CTCTTTGAAG CTGTCCCCAT CTCAGTGTCA GACCAGCCTT 720 

CTCCTCAKCT GACCACCCTC CTCTGACCSA CGCCCCCTCC TTGTCTGAAA AAAGGAGCCT 780 

TGAATGGTGG AGGGAGGCAG TGGGGAGAAA GGTCTCACCG GAGAGGTTGG GAGAATGAGG 840

TCAGCGGTGC TGGGGAACAG ATGGAGGGGG CAGTGGGGAC AGGGCTTGGG CAGACACCAG 900 

CAGGAATAAT TTGAAATGTG TGAGGTGACT CCCCGGAGGC CTTGGGCTTG GGCATTTGGG 960 

AAAAGAATGA TGTCTGGAAG GGCTTAAGGG ACACAGTGGA CGAGGGGAGA GTCCTCATCT 1020 

GCTGGCATTT TGTGGGGTGT TAGTGCCAAA CTTGAATAGG GGCTGGGGTG CTGTCTTCCA 1080 

CTGACACCCA AATCCAGAAT CCCTGGTCTT GAGTCCCCAG AACTTTGCCT CrTGACTGTC 1140

CCTTCTCTTC CTACCTCCAT CCATGGAAAA TTAGTTATTT TCTGATCCTT TCCCCTGCCT 1200 

GGTCTAGCTC CTCTCCAAAC AGCCATGCCC TCCAAATGCT AGAGACCTGG GCCCTGAACC 1260 

CTGTAGACAG ATGCCCTCAG AATTGGGGCA TGGGAGGGGG GSTGGGGGAC CCCATGATTC 1320 

AGCCACGGAC TCCAATGCCC AGCTCCTCTC CCCAAAACAA TCCCGACAAT CCCTTATCCC 1380 

TACCCCAACC CTTTGCGGCT CTGTACACAT TTTTAAACCT GGCAAAAGAT GAAGAGAATA 1440

TTGTAA 1446



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1109 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

TTTTTTTTTT TTTGATATGA AATTGTCTTT CTCCATTGCA GAAATAAGCT AGGGAAACAC 60 

TAACCCAAAA ACTTTCTGTA GAGCTGTTCC TTTGGAGGCA GCATCACTTA TTGGCAGTAA 120 

AGACTCAGTA TAAAAGCACC AGCATCCCTA CTTGGGTGAT GGGGATTAAT TTTATAGCAT 180 

TCCATTTTCC TAGTGCCACA TGTGAAATTG GATTTTGATG ATCTTAATCT ATATTCTACC 240 

CTTATAATAA AAGATCAAAA GATATATCTC CTATGAACAG ATTGGAGATA GGAGATGAAA 300

AGTTGGGAGG ATGTCTTTAT TCTAATGTGA GGGTAGGGAA AATGTGGATA ACATTACTGG 360 

GGTGARGGAG GCATTGTTCT TTAGTTGGAG TTCTCATTTT TATTCTCCAG TACTGACTTG 420 

TGGGGAAAGC ATACTTTTTC ACTGCCAGGT ACTGAATGCA GAGGCTCAGT GAAGTATATA 480 

TGTGGGAAGT GCATGCATTT CGTTTATTAG CAAACATAGC TGGATTAAGA CAAAGTTGTT 540

GGTTTGGAAA GGGGTTAAAG CCTTAAGTGA ACAAATCTAG CTAACAGTGA ATGAACTAGG 600

TAATATAACT TGCATATTTT TAATTTCCTT TGGTTAAAGG TCCCCCATAC TTCTCTGTTC 660 

GGAGACATGA GAAGTATGAT TACTTCAGTG TTAGTTTTCT TAATTTTTTT TTTCCCCTAT 720 

TTGTCCCTTG TCACTTTGTT GCAAGCTAGA AATCTGTGGG TTATACATAG GGCAGCTCTT 780 

TGTGAAAGTG GTTTATTCCA CTGGA3AAAG GGGATTGAAA ATCAGTTAGA ACCAATGTAT 840 

TTCTTGCCCC ACGGAACACT ATTCCTATAA GATAGCTGAA AGAAGCTGCT GTGAGGAGCT 900

CAGCTCCAAA CACAGGATCA GCACCTTGTA TAGGAATTCC CATGAATTAT GACTTCTCAT 960 

TCTGTTTTAT CAGAGTGCAT ATATGTCCTA CTTCAGGAAA AGTAAAACAG TCATTTACGA 1020 

AAGAAAGTCA ATCTGTATCC TAAGCATTTT AATAAAAAGT TAAAACAAAA AATTAAAAGG 1080 

GACACTCGAG GGGGGGCCCG AAACCCAAT 1109 



(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 497 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
TAGGACTAAC TTAAATTCTT TTATTCATCT TTTATTTATT AAAAAATTTT ATTTCTTTGA 60
ATTTTCCTGT AATTTCCTTA RGCTCTTCTA TAAAATGTTA TATTCATGTG AACCATACCT 120
CATTATCCTT AACATTTACT CTCAAAAAGC TTTTTATTTT TATTTTTTTG AAGGTAGTTT 180 
TTCTGTGTGT ACTCTGTAAC ATGATTTTGC TTTCAAATCA TTGTTGTGCC CCCATACAAA 240 

ATGCCTTTTA TTTTTGAGGA TCGTGGACTT TTTAGTATGG CATGAGTGTG CTAAAAGCCA 300 
GATATCTTTC CACATTCACT GGTGGCTTTG ACACCTAGTT TTTAATCTCC CATCCTTACT 360 
TTAAACCCTG ACAGTGCAGT CCTCAGTCAG GGCCAGGACC GGGCTGAGGC CCTTTGTGGA 420
GATGCTGCAC CACCAGCAGA AGGCTGAGAC CTGGTTACCT GTACCTGTTC ACTTGTAATA 480 
AAAAGAATTA TCTAAAA 497 



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

ATGAGGCAGA GGCAAGCTGC CTGCCAACCC CCTCCCTCAA GGAATGGCCT TGCCCAGGAA 60 

TGCCCACCAC ACATACCCTC TTCTTTTTTT CTAGTCAAAC TCTTGTTTAT TCCTTGGCTT 120 

GCCTCCCTCC TTTCCTCCCC TCTCAACCTT TTACTTCTGG TTTCTATTTC ATGGGATTTG 180 

GGGTTGAAGT TAAACTTACA ACAGTGCCGC CAACACCAAG TCTTGCAGGA AAAAAATACA 240

AAGAAATTTA ACAAAAAAAA AAAAAAAAA 269 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1269 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
TTGATTGACT ATGGTCTCTC CGGCTACCAG GAAGAGTCTG CCGAAGTGAA GGCCATGGAC 60
TTCATCACCT CCACAGCCAT CCTGCCCCTG CTGTTCGGCT GCCTGGGCGT CTTCGGCCTC 120
TTCCGGCTGC TGCAGTGGGT GCGCGGGAAG GCCTACCTGC GGAATGCTGT GGTGGTGATC 180
ACAGGCGCCA CCTCAGGGCT GGGCAAAGAA TGTGCAAAAG TCTTCTATGC TGCGGGTGCT 240
AAACTGGTGC TCTGTGGCCG GAATGGTGGG GCCCTAGAAG AGCTCATCAG AGAACTCACC 300
GCTTCTCATG CCACCAAGGT GCAGACACAC AAGCCTTACT TGGTGACCTT CGACCTCACA 360
GACTCTGGGG CCATAGTTGC AGCAGCAGCT GAGATCCTGC AGTGCTTTGG CTATGTCGAC 420
ATACTTGTCA ACAATGCTGG GATCAGCTAC CGTGGTACCA TCATGGACAC CACAGTGGAT 480
GTGGACAAGA GGGTCATGGA GACAAACTAC TTTGGCCCAG TTGCTCTAAC GAAAGCACTC 540
CTGCCCTCCA TGATCAAGAG GAGGCAAGGC CACATTGTCG CCATCAGCAG CATCCAGGGC 600
AAGATGAGCA TTCCTTTTCG ATCAGCATAT GCAGCCTCCA AGCACGCAAC CCAGGCTTTC 660
TTTGACTGTC TGCGTGCCGA GATGGAACAG TATGAAATTG AGGTGACCGT CATCAGCCCC 720
GGCTACATCC ACACCAACCT CTCTGTAAAT GCCATCACCG CGGATGGATC TAGGTATGGA 780
GTTATGGACA CCACCACAGC CCAGGGCCGA AGCCCTGTGG AGGTGGCCCA GGATGTTCTT 840
GCTGCTGTGG GGAAGAAGAA GAAAGATGTG ATCCTGGCTG ACTTACTGCC TTCCTTGGCT 900
GTTTATCTTC GAACTCTGGC TCCTGGGCTC TTCTTCAGCC TCATGCCTCC AGGGCCAGAA 960
AAGAGCGGAA ATCCAAGAAC TCCTAGTACT CTGACCAGCC AGGGCCAGGG CAGAGAAGCA 1020
GCACTCTTAG GCTTGCTTAC TCTACAAGGG ACAGTTGCAT TTGTTGAGAC TTTAATGGAG 1080
ATTTGTCTCA CAAGTGGGAA AGACTGAAGA AACACATCTC GTGCAGATCT GCTGGCAGAG 1140
GACAATCAAA AACGACAACA AGCTTCTTCC CAGGGTGAGG GGAAACACTT AAGGAATAAA 1200
TATGGAGCTG GGGTTTAACA CTAAAAACTA GAAATAAACA TCTCAAACAG TAAAAAAAAA 1260
AAAAAAAAC 1269



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1944 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
AAAAGGCAAA CTATAGGATA ACACAGAGCC CTTTTTGAAA ATAAATTGGC ATTGGAGTGT 60
TTTACCCTCT AGCTGTTTTA CTTAGAATGT AACATATGCT GCCTACCCAC CTCAAAATGT 120
CTGTACTGCA AGAGGGCCCT GGGCCTCTGC TTTCCATATT CACGTTTGGC CAGAGTTGTA 180
GTCCCAAAGA AGAGCATGGG TGGCAGATGG TAGGGAATTG AACTGGCCTG TGCAATGGGC 240
ATGGAGCACA AGGGGTCACA GCATGCCTCC TGCCTTACCG TGGCAGTACG GAGACAGTCC 300
AGAACATGGT CTTCTTGCCA CGGGGTGTTG TTGTCTCTGG TGGTGCTGCA TGTCTGTGGC 360
TCACCTTTAT TCTTGAAACT GAGGTTTACC TGGATCTGGC TACTGAGGCT AGAGCCCACA 420
GCAGAATGGG GTTGGGCCTG TGGCCCCCAA ACTAGGGGGT GTGGGTTCAT CACAGTGTTG 480
CCTTTTGTCT CCTAAAGATA GGGATCTACT TTTGAAGGGA ATTGTTCCTC CCAAATAAAT 540
TTGCTTTACC TTGGTCCTTT CTTTTGTGCC AGTATTCAAG TGGTATAGCT CTGAGCAGGG 600
TCACATTTGG CCAAACCTGA CACTGTCTTG CTGCATTCTC CTTTGGCAAA CATCAGGGTC 660
AGAATTCAGG ATAGCCCTTC CTAGGGCACT GGACTTTCTG GCATGGGGGC TGTGTTTGCA 720
CAAGTTATTT TCATGTTACC TGGAGAGTGT CCAGAGGCTG CTCTGAGGCT GAGGTGTGTT 780
CCCCCTTGCC TGGTTCCAGC TGTCAGAGGG ATACCATCCT AGGGTCTGGG AATCCAAGGC 840
CACGAGACTC CTTGGTTTGT GGTCCGAGAT CCTGTACTAA GGAGGGTCTG GCCAGAGGAA 900
CAGACCAGCT TTTGCACAAT GAAGCGCAAG GGAACAAGTG GTTTGCCTGG TGTCCTACCT 960
GTCCTGAACC TGGTCCTGTG GGCCATTGAA AAGTTAGATC TGTGATCTCT GGGGTTTTTG 1020
TGGCTTTGTT CAATGCTTCC ACTCTAGGGC AGGCAGAGCA GTCTATACTC TCCCAAGCCT 1080
GCTTGACCTC CAAGTAGAGC TGATACAGAG ATCTGTGAAT ATTGTGATAG AAATTCTTTG 1140
GTATTCATAC ATTTCAGCTG CAAGTCAGCA ATTTCCCAGG TACCATGTAA GCTATAAAAC 1200
AGTCATTCTT AAAGACAGAG GATAGCTGTG ACTCATGGGA TCATGAGGTC CATGGCTGGT 1260
TGCAGGTTCC CTTTTTCCTT CCTCAGGTTT TGTCTCTTCC TGTGTTGTCC CCAGCAAGGG 1320
AGAGACTGTG GGGTGGATTG GGAGAACAGA TTAGGAGTAT AGCAAATGAA CCCAGAATGG 1380
AACAGTGGGG AGCTAACTGT GAATGAGGAG AGTACCTGCT GCAGGACCTG GAGGTCAGGT 1440
GTGAATGCTG TATTGGCACA GGGAATAAAT ATCCTGGCGT CTGGAGCCTT CACCTCTCCG 1500
TCAAGTCCTT CCTGTGATAC TGCCATGGCA CAGGATCTGA GTTGCAGCTC TGCACCCTAA 1560
ATCACACCCT GGGCATTGTC TGGGCTGCAG GGCTGCCAGG TTCTGTACTT GTGTCCAGCT 1620
GTGGCCCTGG ATGCTGGAGC TGGAGGGTTT TCTGTGCTCA GACTGTAGCC TGTAGCTCTT 1680
GGCCTGTGTA GAGCCCCCTC CTGTGCCCTC AGTGGCTGTC GTTTGTTAAC ATCATCAGGA 1740
AGATGGGAAA GGTCAGGCAG AATTTTTCTG CCCTACAAAG GGTGGAAGAG AAAGGACACA 1800
GTATTTTCAT GAATTTACCA TATATCTTTG TTTTTCTTCA ACGAAAAAGT TAATTGAGGC 1860
AATGTCATCT GCTCAAAGTT GAGTGGTTTA TTCACAATAA ACTGTAAGTT TCTGATTATA 1920
AAAAAAAAAA AAAAAAAAAA AAAG 1944

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
TCGACCCACG CGTCCGGGGT GCGCAACGGG GAGOTCCGGC TGGAGACCCG TGCTCTGGGC 60
CGGCGCCTTC ACCATGGCCT CGGCAGAGCT GGACTACACC ATCGAGATCC CGGATCAGCC 120
CTGCTGGAGC CAGAAGAACA GCCCCAGCCC AGGTGGGAAG GAGGCAGAAA CTCGGCAGCC 180
TGTGGTGATT CTYTTGGGCT GGGGTGGCTG CAAGGACAAG AACCTTGCCA AGTACAGTGC 240
CATCTACCAC AAAAGGGGCT GCATCGTAAT CCGATACACA GCCCCGTGGC ACATGGTCTT 300
CTTCTCCGAG TCACTGGGTA TCCCTTCACT TCGTGTTTTG GCCCAGAAGC TGCTCGAGCT 360
GCTCTTTGAT TATGAGATTG AGAAGGAGCC CCTGCTCTTC CATGTCTTCA GCAACGGTGG 420
CGTCATGCTG TACCGCTACG TGCTGGAGCT CCTGCAGACC CGTCGCTTCT GCCGCCTGCG 480
TGTGGTGGGC ACCATCTTTG ACAGCGCTCC TGGTGACAGC AACCTGGTAG GGGCTCTGCG 540
GGCCCTGGCA GCCATCCTGG AGCGCCGGGC CGCCATGCTG CGCCTGTTGC TGCTGGTGGC 600
CTTTGCCCTG GTGGTCGTCC TGTTCCACGT CCTGCTTGCT CCCATCACAG CCOTCTTCCA 660
CACCCACTTC TATGACAGGC TACAGGACGC GGGCTCTCGC TGGCCCGAGC TCTACCTCTA 720
CTCGAGGGCT GACGAAGTAG TCCTGGCCAG AGACATAGAA CGCATGGTGG AGGCACGCCT 780
GGCACGCCGG GTCCTGGCGC GTTCTGTGGA TTTCGTGTCA TCTGCACACG TCAGCCACCT 840
CCGTGACTAC CCTACTTACT ACACAAGCCT CTGTGTCGAC TTCATGCGCA ACTGCGTCCG 900
CTGCTGAGGC CATTGCTCCA TCTCACCTCT GCTCCAGAAA TAAATGCCTG ACACCTCCCC 960
ACAAAAAAAA AAAAAAAAAA ACTCGAGGGG GGGCCCGGTA CCCAATTCGC CCTATAAAGG 1020
T 1021



(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

GGCACGAGGA GGGCCACGGC AGCCATCGCG CTTTGCAGTT CGGTCTCCTG GTGTACGGCC 60 

AACGCCAAGT AGGGGATTGC GTTCCCTCCA GTCGCAGACC CTATCAGATT TGGATATGTC 120 

CTTCATATTT GATTGGATTT ACAGTGGTTT CAGCAGTGTG CTACAGTTTT TAGGATTATA 180 

TAAGAAAACT GGTAAACTGG TATTTCTTGG ATTGGATAAT GCAGGAAAAA CAACATTGCT 240 

ACACATGCTA AAAGATGACA GACTTGGACA ACATGTCCCA ACATTACATC CCACTTCCGA 300 

AGAACTGACC ATTGCTGGCA TGACGTTTAC AACTTTTGAT CTGGGTGGAC ATGTTCAAGC 360 

TCGAAGAGTG TGGAAAAACT ACCTTCCTGC TATCAATGGC ATTGTATTTC TGGTGGATTG 420 

TGCAGACCAC GAAAGGCTGT TAGAGTCAAA AGAAGAACTT GATTCACTAA TGACAGATGA 480 

AACCATTGCT AATGTGCCTA TACTGATTCT TGGGAATAAG ATCGACAGAC CTGAAGCCAT 540 

CAGTGAAGAG AGGTTGCGAG AGATGTTTGG TTTATATGGT CAGACAACAG GAAAGGGGAG 600 

TATATCTCTG AAAGAACTGA ATGCCCGACC CTTAGAAGTT TTCATGTGTA GTGTGCTCAA 660 

AAGACAAGGT TACGGAGAAG GCTTCCGCTG GATGGCACAG TACATTGATT AACACAAACT 720 

CACATTGGTT CCAGGTCTCA ACGTTCAGGC TTACTCAGAG ATTTGATTGC TCAACATGCA 780 

TAACTTGAAT TCAATAGACT TTTGCTGGTT ATAAAACAGA TGTTTTTTAG ATTATTAATA 840 

TTAAATCAAC TTAATTTGAA TGAGAATTGA AAACTGATTC AAGTAAGTTT GAGTATCACA 900 

ATGTTAGCTT TCTAATTCCA TAAAAGTACT TGGTTTTTAC AGTTTATAAT CTGACATCAC 960 

CCCAGCGCCA TTTGTAAAGA GCAACTTTCC AGCAGTACAT TTGAAGCACT TTTTAACAAC 1020 

ATGAAACTAT AAACCATATT TAAAAGCTCA TCATGTTAAA TTTTTTATGT ACTTTTCTGG 1080 

AACTAGTTTT TAAATTTTAG ATTATATGTC CACCTATCKT AAGTGTACAG TTAATAATTA 1140 

GCTTATTCAA TGATTGCATG ATGCCTTACA GTTTTCAATA ACTTTTTTTC TTATGCAAAC 1200 

GTCATGCAAT AAAACAAACT CTAATGTTTG GCAAAAAAAA AAAAAAAAAA NTCGAGGGGG 1260 

GGCCCGTACC CAATTCGCCC TAAAG 1285 

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 1386 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

GGCACGAGGT GGCGCAGGGG TCAGTGGTTC TCTCGGGTCT CGGGAGAGGT GAGCACCCTG 60 

ATGAAGGCCA CGGTCCTGAT GCGGCACCTG GGCGGGTGCA GGAGATCGTG GGCGCCCTCC 120 

GCAAGGGCGS CGGAGACCGG TTACAGGTGA TTTCTGATTT TRACATGACC TTGAGCAGGT 180 

TTGCATATAA TGGAAAGCGA TGCCCTTCTT CTTACAATAT TCTGGATAAT AGCAAGATCA 240 

TCAGTGAGGA GTGTCGGAAA GAGCTCACAG CGCTCCTTCA CCACTATTAC CCAATTGAGA 300 

TCGACCCACA CCGGACCGTC AAGGAGAAGC TACCTCATAT GGTGGAATGG TGGACCAAAG 360 

CGCACAATCT CCTATGTCAG CAGAAGATTC AGAAGTTTCA GATAGCCCAG GTGGTTAGAG 420 

AGTCCAATGC AATGCTCAGG GAGGGATATA AGACCTTCTT CAACACACTC TACCATAACA 480 

ACATTCCCCT TTTCATCTTT TCTGCGGGCA TTGGTGATAT CCTGGAAGAA ATTATCCGAC 540 

AGATGAAAGT GTTCCACCCC AACATCCACA TCGTGTCTAA CTACATGGAT TTTAATGAAG 600 

ATGGTTTTCT CCAGGGATTT AAGGGCCAGC TGATACACAC ATACAACAAG AACAGCTCTG 660 

TGTGTGAGAA CTSTGGTTAC TTCCAGCAAC TTGAGGGCAA AACCAATGTC ATCCTGCTGG 720 

GAGACTCTAT CGGGGACCTC ACCATGGCCG ATGGGGTTCC TGGTGTGCAG AACATTCTCA 780 

AAATTGGCTT CCTGAATGAC AAGGTGGAGG AGCGGCGGGA NCGCTACATG GACTCCTATG 840 

ACATCGTGCT GGAGAAGGAC GAGACTCTGG ATGTGGTCAA CGGGCTACTG CAGCACATCC 900 

TGTGCCAGGG GGTCCAGCTG GAGATGCAAG GCCCCTGAAG GCGCAGGCTN CCAGNCCGCC 960 

TGCAGGCCGT GGTGAGGAGG GGCGCCTCCC CAGAGTCTGC TCCCCCGTGA ACACAGAGCA 1020 

GANGCCAGGG TGGCCAGCAG TGGCTGGGTC CTTCCGCGCC CCTCCGTCCT CCTTTCCCTG 1080 

AGCACCTTCA TCACCAGAGG CTTGAAGGAA CCCCGCCATG TGGCAGGGCA CAGGCACTGT 1140 

TCCTGGTGAA CCTTGGACCA CAGCATGTCA GTGCTCTAGG GATTGTCTAC TCCAGGGATT 1200 

TTCTTCAAAA TTTTTAAACA TGGGAAGTTC AAACAAATAT AATGTGTGAA ACAGATCAAA 1260 

ATTTTTAAAA TGAAAAAAAA GCTGCTCTGA TTCAGGGGAT GTGGGTCGGG GTAGAACCTG 1320 

GACCTCTTGG CCTGGGGGCA CATGGGATGC TTCTAGGAAC ACAGTTTGAG AACCACCAAA 1380 

AAAAAA 1386 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2098 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

AGCCCTTCTC CCCGCGCTTG GGACTCTGAC ATCTTAAGGC TGCACGGTCG TGTCCTTGTC 60 

TGGGTGAGGC CATGTCTGTG ATCCAAGGTT CCTGGAACTG ACACAGGAAG GGGCTGTGAA 120 

CCCTAAGTGG GTGTMATCTC CTCCRACCGA GGCTTCTMAC CCTGGAGATG GCAGTTACTC 180 

CTGGCCATGG TTGCTGAGCA TGGGCAGACC AGTGGAGGCC ACCCTACTGT GTTATCTGCG 240 

CCTTCRATGA AGTGAGACCC TTGGGGAGAA CGGGCTGTGG ATGAAGGAGT GGACTGCAGC 300 

CTTGGCCTAG CCACTGGGCT GGGATCTTCT GGGTCATGTG ACTGTGTATC CAGGAGCAGA 360 

AACTTGTATT CTCAGGATTC AGGATCTACC CAGCACCAAA GATGTATTTT CAGGAGAACA 420 

GACCTAGAAA TGGGCCTGTC TGGCATTTCA GAGTCAQGCA AAGCAGGCAG GGCCAGGGAG 480 

CTTCTGTGGG TCTACACAAG AAGGTTCCTG TGAGGGCTAT CAGTTGTTGC CTTCTAGCTT 540 

GCTGGTAACT TTGGCGCCTC CGCCAAGCCC TGCCAGACTC CCCTGGCTGT GATGGCATTC 600 

TGTGCCATCC TGCCTTGTCC CCAGCCTCTG CAGGATGCCC TCCCTACCCA MCTYTYCCTG 660 

GGCCTTCCCT GTCCACTGGG CTGGATTCAT GTTCAAACCA CTGGACTGGC AGGGCAACGA 720 

CITCITCCCA CCTCAAGATG AGGTCCTCGC CCCCTTGTCT TGGCATAAAA ACACCTTTAA 780 

AGCATGAGCC ATGTGCTTCT TTGCCCTTCT CTGTCCTGTT CCAATCTTCT GCCTCCCAGT 840 

CACTCCCTGG GGACTATGGG ATCACTGTCC CCCCACCTGT GTGGCCACAC CATGTGTCCT 900 

GTCAATCCAG AACTGCCTCT GAGCTCCAGG CTGACCACAG ATCAGCCACA GCCTGATGCC 960 

TGCAGCCCCA CTTTGCTCAC CCTTCCCCTC CCCTCCTCCT TCCTTCCACA CAGCAAGCCT 1020 

ACCTTTYTCC ATCCATGCTC ACCATAGCCC CCTTCCTTGT GACCTGGACC CTCCATTGTA 1080 

CCTGGCTGAG ACTGTCAGCC TCCTGGAGGA GTGGGGTCCA CCTTCTTCTT GCCCTATGCA 1140 

GTGCAAGCTT CACTTCTCAC CCAGCAAGGT TGACTCATCT GCCTCCATGT CTCTGGGGCT 1200 

TTGCTSTTGC CCTGAAACCT AGCTGGGCTG GTCTTGCTCC CAGCTTGCTT CCCCCTCCTC 1260 

GGATGTCCCT TTGCAGGCCC CTGTCGTTCC TCCGGCACCA GTGTCCTTGG CTGCCATGGC 1320 

AAGCTCATCA GGGGCTTGTA CCCTGGTCAC CAAGCATGGT AGCAGCTGCC TGCATTGTAT 1380 

CTCCATCTGG TCACTGCAGG TGCCAACCCT TCATCCCCCA TGTTTTCCTG GGCCATGGAG 1440 

GGCTGACCTC CGTTTCTGGG GAATGTGGCT GAGCTGTGGT AACCAGCTAC ACCCCAGGTG 1500 

CTCTTTCCAT GGTGGTGCCT GCTCATCTTG CTGATGCAAA CTAGGAAGTT AGGCTGCATC 1560 

TCGGAGTGGC TTTCGCTGGA GAGGTGCTTT QCTGTCTCTC AGACTCAGTC ACTGTGTTCC 1620 

CTCCCCGCCT CTCTTATCTC CATGGCTGTT TGCAGCTCTC CCAGGTACTT TGGGGTCTGA 1680 

GCTGGAATTC CTTTGTGGTT TGCTCTTCTG CTTCTCACTC TTGTATTAAG AAGGATTCCA 1740 

CAAAGGGAGA GTGGCATCCC TGCTGCTGCT GTGCCAGACC AGAGTTTCCT GAGGGGCCCT 1800 

GACCCTAACC CTCCAGCTCA GCCCTGTACA CCTGACCCTG TAAATGAGTG GGGTTTGCTG 1860 

ACTGTAATCC CTGACACCAG TAAAACCAAA AGGACTCTTG GGGGCTCAGT GTGAGAGCCA 1920 

GGGTTACCTA CTCTGCCAAG TGAGGACAAA CTGCTAGGCT GTATCCCATA ATTTCAGGAT 1980 

GAGAAACATT AACAATAAAA ATTTGTAGTA AACATAACCT CATGANGACT AAAAAAAAAA 2040 

AAAAACTYGG GGGGGGCCCQ GTAACCCATT GGGCCCTTNG GGGGGGNGTT TTAAAATT 2098 

(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1847 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

TCGACCCACG CGTCCGAACT GAGGCGGCGG CGGGAGCCGG TTGGKGTCTG GTCTTCGCGT 60 

CGGCCCCGCG GACCAGACGC TGCCCCCGGC GCGGGGAGAA GATGGTGCCK AGCGGCCTCG 120 

GGCCCGCCAC GCGCCGCCAC GAGTGAGCCC AGCGCGACCG CGGGCGTCCG CCGAGCAGCT 180 

GGCCCGGCTG GGCCCGGGGC GCGCANTGCC CGCCGGGGCG GGGTGGAGCT GATCAGAATA 240 

ATGTTCAGCA TCAACCCCCT GGAGAACCTG AAGGTGTACA TCAGCAGTCG GCCTCCCCTG 300 

GTGGTCTTCA TGATCAGCGT AANGCCCATG GCCATAGCTT TCCTGACCCT GGGCTACTTC 360 

TTCAAAATCA AGGAGATTAA ATCCCCAGAA ATGGCAGAGG ATTGGAATAC TTTTCTGCTA 420 

CGGTTCAATG ATTTGGACTT GTGTGTATCA GAGAATGAAA CCCTCAAGCA TCTCACAAAC 480 

GACACCACAA CTCCGGAAAG TACAATGACC AGCGGGCAGG CCCGAGCTTC CACCCAGTCC 540 

CCCCAGGCCC TGGAGGACTC GGGCCCGGTG AATATCTCAG TCTCAATCAC CCTAACCCTG 600 

GACCCACTGA AACCCTTCGG AGGGTATTCC CGCAACGTCA CCCATCTGTA CTCAACCATC 660 

TTAGGGCATC AGATTGGACT TTCAGGCAGG GAAGCCCACG AGGAGATAAA CATCACCTTC 720 

ACCCTGCCTA CAGCGTGGAG CTCAGATGAC TGCGCCCTCC ACGGTCACTG TGAGCAGGTG 780 

GTATTCACAG CCTGCATGAC CCTCACGGCC AGCCCTGGGG TGTTCCCCGT CACTGTACAG 840 

CCACCGCACT GTGTTCCTGA CACGTACAGC AACGCCACGC TCTGGTACAA GATCTTCACA 900 

ACTGCCAGAG ATGCCAACAC AAAATACGCC CAAGATTACA ATCCTTTCTG GTGTTATAAG 960 

GGGGCCATTG GAAAAGTCTA TCATGCTTTA AATCCCAAGC TTACAGTGAT TGTTCCAGAT 1020 

GATGACCGTT CATTAATAAA TTTGCATCTC ATGCACACCA GTTACTTCCT CTTTGTGATG 1080 

GTGATAACAA TGTTTTGCTA TGCTGTTATC AAGGGCAGAC CTAGCAAATT GCGTCAGAGC 1140 

AATCCTGAAT TTTGTCCCGA GAAGGTGGCT TTQGCTGAAG CCTAATTCCA CAGCTCCTTG 1200 

TTTTTTGAGA GAGACTGAGA GAACCATAAT CCTTGCCTGC TGAACCCAGC CTGGGCCTGG 1260 

ATGCTCTGTG AATACATTAT CTTGCGATGT TGGGTTATTC CAGCCAAAGA CATTTCAAGT 1320 

GCCTGTAACT GATTTGTACA TATTTATAAA AATCTATTCA GAAATTGGTC CAATAATGCA 1380 

CGTGCTTTGC CCTGGGTACA GCCAGAGCCC TTCAACCCCA CCTTGGACTT GAGGACCTAC 1440 

CTGATGGGAC GTTTCCACGT GTCTCTAGAG AAGGATTCCT GGATCTAGCT GGTCACGACG 1500 

ATGTTTTCAC CAAGGTCACA GGAGCATTGC GTCGCTGATG GGGTTGAAGT TTGGTTTGGT 1560 

TCTTGTTTCA GCCCAATATG TAGAGAACAT TTGAAACAGT CTGCACCTTT GATACGGTAT 1620 

TGCATTTCCA AAGCCACCAA TCCATTTTGT GGATTTTATG TGTCTGTGGC TTAATAATCA 1680 

TAGTAACAAC AATAATACCT TTTTCTCCAT TTTGCTTGCA GGAAACATAC CTTAAGTTTT 1740 

TTTTGTTTTG TTTTTGTTTT TTTGTTTTTT GTTTTCCTTT ATGAAGAAAA AATAAAATAG 1800 

TCACATTTTA ATACTACCAA AAAATGGACA AAAAAAGTCG AGGGGGG 1847 



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1569 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 
GACGCTGACG AGAGAAGGCC TCTTCCTTGA GGGTTGGTGC TGTGTTGCAG TGACCGTGGC 60 
GGATTACGCC AACTCGGATC CGGCGGTCGT GAGGTCTGGA CGAGTCAAGA AAGCCGTAGC 120 
CAACGCTGTT CAGCAGGAAG TAAAATCTCT TTGTGGCTTG GAAGCCTCTC AGGTTCCTGC 180 
AGAGGAAGCT CTTTCTGGGG CTGGTGAGCC CTGTGACATC ATCGACAGCA GTGATGAGAT 240 

GGATGCCCAG GAGGAAAGCA TCCATGAGAG AACTGTCTCC AGAAAAAAGA AAAGCAAGAG 300 
ACACAAAGAA GAACTGGACG GGGCTGGAGG AGAAGAGTAT CCCATGGATA TTTGGCTATT 360 
GCTGGCCTCC TATATCCGTC CTGAGGACAT TGTGAATTTT TCCCTGATTT GTAAGAATGC 420 
CTGGACTGTC ACTTGCACTG CTGCCTTTTG GACCAGGTTG TACCGAAGCA CTACACGCTG 480 
GATGCTTCCC TGCCTTTGCG TCTGCGACCA GAGTCAATGG AGAAGCTGCG CTGTCTCCGG 540 

CCTTGTGTGA TCCGATCTCT GTACCATATG TATGAGCCAT TTGCTGCTCG AATCTCCAAG 600 

AATCCAGCCA TTCCAGAAAG CACCCCCAGC ACA1TAAAGA ATTCCAAATG CTTACTTTTC 660 

TGGTGCAGAA AGATTGTTGG GAACAGACAG GAACCAATGT GGGAATTCAA CTTCAAGTTC 720 

AAAAAACAGT CCCCTAGGTT AAAGAGCAAG TGTACAGGAG GATTGCAGCC TCCCGTTCAG 780 

TACGAAGATG TTCATACCAA TCCAGACCAG GACTGCTGCC TACTGCAGGT CACCACCCTC 840 

AATTTCATCT TTATTCCGAT TGTCATGGGA ATGATATTTA CTCTGTTTAC TATCAATGTG 900 

AGCACGGACA TGCGGCATCA TCGAGTGAGA CTGGTGTTCC AAGATTCCCC TGTCCATGGT 960 

GGTCGGAAAC TGCGCAGTGA ACAGGGTGTG CAAGTCATCC TCGACCCAGT GCACAGCGTT 1020 

CGGCTCTTTG ACTGGTGGCA TCCTCAGTAC CCATTCTCCC TGAGAGCGTA GTTACTGCTT 1080 

CCCATCCCTT GGGGGCAGCC TCGAGTGTAG TCCATTAGTA ATCAGATTCC AGTTTGGACA 1140 

GGGTGGCTGG ATTGTATATC TCGTTAGTAA TGTACATGCT CTTCAGGTTC TAGGGCTCCT 1200 

GTTAGGGGAG GGAGAAATGT TGAATCAAGA GGGAAAACAA CTACTATGAT TTATAAACAT 1260 

ATTTTAATGT AAAAATTTGC ATTTAAAAGG AGTGGCCCTG TTTTCTGTGT TAAAACCCCA 1320 

TTTGGTGCTA TTGAGTTTGT TCTTTATTCT TTTATCCCAG TGAAAATTGT TGATCTTGCT 1380 

GTAGGGAAAA ATTAAACTCT TTGAATCTCC AAACAAGGAA GTTTCAGCAT TCCCTTATGG 1440 

ATCAGAGGAA CCTTAGAGGC CTGAAATTGT TGCTTCCAGT TTAGCTGCCC CTCAAATTCA 1500 

AGTGAATATT TTCCCTTCTC CCTTTACCCT TCTCCAGAAA TAAAGCAGGT GACAGGGTTT 1560 

CAGAATCTT 1569 

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1540 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

CCCACGCGTC CGGAAGGATT GACCAGTTAA CCAACATCTT AGCCCCCATG GCTGTTGGCC 60 

AGATTATGAC ATTTGGCTCC CCAGTCATCG GCTGTGGCTT TATTTCGGGA TGGAACTTGG 120 

TATCCATGTG CGTGGAGTAC GTCCTGCTCT GGAAGGTTTA CCAGAAAACC CCAGCTCTAG 180 

CTGTGAAAGC TGGTCTTAAA GAAGAGGAAA CTGAATTGAA ACAGCTGAAT TTACACAAAG 240 

ATACTGAGCC AAAACCCCTG GAGGGAACTC ATCTAATGGG TGTGAAAGAC TCTAACATCC 300 

ATGAGCTTGA ACATGAGCAA GAGCCTACTT GTCCCTCCCA GATGGCTGAG CCCITCCGTA 360 

CCTTCCGAGA TGGATGGGTC TCCTACTACA ACCAGCCTGT GTTTCTGGCT GGCATGGGTC 420 

TTGCTTTCCT TTATATGACT GTCCTGGGCT TTGACTGCAT CACCACAGGG TACGCCTACA 480 

CTCAGGGACT GAGTGGGTTC CATCCTCAGT ATTTTGATGG GAGCATCAGC TATAACTGGA 540 

ATAATGGGAA CTGTAGCTTT TACTTGGCTA CGTCGAAAAT GTGGTTTGGT TCGGCAGGTC 600 

TGATCTCAGG ATTGGCACAG CTTTCCTGTT TGATCTTGTG TGTGATCTCT GTATTCATGC 660 

CTGGAAGCCC CCTGGACTTG TCCGTTTCTC CTTTTGAAGA TATCCGATCA AGGTTCATTC 720 

AAGGAGAGTC AATTACACCT ACCAAGATAC CTGAAATTAC AACTGAAATA TACATGTCTA 780 

ATGGGTCTAA TTCTGCTAAT ATTGTCCCGG AGACAAGTCC TGAATCTGTG CCCATAATCT 840 

CTGTCAGTCT GCTGTTTGCA GGCGTCATTG CTGCTAGAAT CGGTCTTTCG TCCTTTGATT 900 

TAACTGTGAC ACAGTTGCTG CAAGAAAATG TAATTGAATC TGAAAGAGGC ATTATAAATG 960 

GTGTACAGAA CTCCATGAAC TATCTTCTTG ATCTTCTGCA TTTCATCATG GTCATCCTGG 1020 

CTCCAAATCC TGAAGCTTTT GGCTTGCTCG TATTGATTTC AGTCTCCTTT GTGGCAATGG 1080 

GCCACATTAT GTATTTCCGA TTTGCCCAAA ATACTCTGGG AAACAAGCTC TTTGCTTGCG 1140 

GTCCTGATGC AAAAGAAGTT AGGAAGGAAA ATCAAGCAAA TACATCTGTT GTTTGAGACA 1200 

GTTTAACTGT TGCTATCCTG TTACTAGATT ATATAGAGCA CATGTGCTTA TTTTGTACTG 1260 

CAGAATTCCA ATAAATGGCT GGGTGTTTTG CTCTGTTTTT ACCACAGCTG TGCCTTGAGA 1320 

ACTAAAAGCT GTTTAGGAAA CCTAAGTCAG CAGAAATTAA CTGGATTAAT TTCCCTTATG 1380 

TTGAGGGCCA TGGRAAAAAA ATTGGGAAAA GGAAAAACTC AGTTTTAAAT ACGGGAGACT 1440 

ATAATGGATA ACACTGRATT CCCCTATTTC TCATGAGTAG ATACAATCTT ACGTAAAAGA 1500 

GTGGTTAGTC ACGTGAATTC AGTTATCATT TGACAGATTC 1540 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1719 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 
TACTTATGAG GTCAATTGGA AATAAGAACA CCATTTTACT GGGTCTAGGA TTTCAAATAT 60 
TACAGTTGGC ATGGTATGGC TTTGGTTCAG AACCTTGGAT GATGTGGGCT GCTGGGGCAG 120 
TAGCAGCCAT GTCTAGCATC ACCTTTCCTG CTGTCAGTGC ACTTGTTTCA CGAACTGCTG 180 

ATGCTGATCA ACAGGGTGTC GTTCAAGGAA TGATAACAGG AATTCGAGGA TTATGCAATG 240 

GTCTGGGACC GGCCCTCTAT GGATTCATTT TCTACATATT CCATGTGGAA CTTAAAGAAC 300 

TGCCAATAAC AGGAACAGAC TTGGGAACAA ACACAAGCCC TCAGCACCAC TTTGAACAGA 360 

ATTCCATCAT CCCTGGCCCT CCCTTCCTAT TTGGAGCCTG TTCAGTACTG CTGGCTCTGC 420 

TTGTTGCCTT GTTTATTCCG GAACATACCA ATTTAAGCTT AAGGTCCAGC AGTTGGAGAA 480 

AGCACTGTGG CAGTCACAGC CATCCTCATA ATACACAAGC GCCAGGAGAG GCCAAAGAAC 540 

C7TTACTCCA GGACACAAAT GTGTGACGAC TGAAATCAGG AAGATTTTTC TATCAGCACC 600 

CAGGTCTTAG TTTTCACCTC TAGTTCTGGA TGTACATTCC ATTTCCATCC ACAGTGTACT 660 

TTAAGATTGT CTTAAGAAAT GTATCTGCAT GAACTCCGTG GGAACTAAAG GAAGTGGGAA 720 

CTTAGAACCA GACAGTTTTC CAAAGATGTT ACAATTTCTT TTGAAAAACC TTTTG1TTAT 780 

TAGCACCAAT TTCTYGCCAC TAAGCTATTT GTTTTATTAT ACATCCTTTA ATTAAAAACT 840 

ATATATGTAA CTTCTTAGAT A'lTAGGAAAT GTCTCTGCTA CCATTTCCTT AAGGTGTTGA 900 

GCTTTAACTC TATGCTGACT CAGTGAGACA CAGTAGGTAG TATGGTTGTG GACCTATTTG 960 

TTTTAACATT GTAAAATTTT GAGTCAGATT TTAATATTGT AAAATCTTGG GTCAAATAAT 1020 

TCAAAGCCTT AATGCAGATG CACTAAAACA AAGAAATGGT AAATGAATTG TTTGCATTTA 1080 

AAAAAAAAAA CTCTTAAGAA AACTGTACTA AATCTGAATC ATGTTTTGAG CTTGTTTGCA 1140 

GTACTTTTAA ACATTATTCA CTACTGTTTT TGAAGTGAGA AAGTATCAGC CATTTAGCAT 1200 

TTAAGTTGGG GTATTTAGAG CCTGTAATCT AAATGCTGGC TCAAATTTAT TCCCCAGCTA 1260 

CTTCTTATAC CACTATTCTT TTAATGTTTG CATAATCATA AGCACCTCAA CACTTGAATA 1320 

CATAATCTAA AAATTATATA GTAAAGCTGG TAGCCTTGAA AATGTCAGTG TGATATCTAT 1380 

TATGTAGATA AATATATATA GTGGCCTTTC AQGACTGTCA CAGTAACACT TTATTTACAG 1440 

AGCTAATGTT TGTCCTAAAT TTTCAGGACC CTAGAGGAGA GCTTTATACA ATTACCGATG 1500 

TGAATTTCTC TAAAGTGTAT ATTTTTGTGT CCAGTTATAT TATTTAAAAA AGTGTTACTT 1560 

TGTAAAAATT GTATATAAAG AACTGTATAG TTTACACTGT TTTCATCTTG TGTGTGGTTA 1620 

TTGCTTAATG CTTTTTAAAC TTGGAACACT CACTATGGTT AAATAAQGTC TTAAAAGAAA 1680 

TGTAAATATT YTGTTAATAA AGTTAAATAT TTTAATGAT 1719 

(2) INFORMATION FOR SEQ ID NO: 153: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 863 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

GGCACGAGGG AAGCCGGGAC GATGTCCGCA TGACAACCGA CGTTGGAGTT TGGAGGTGCT 60 
TGCCTTAGAG CAAGGGAAAC AGCTCTCATT CAAAGGAACT AGAAGCCTCT CCCTCAGTGG 120 



TAGGGAGACA GCCAGGAGCG GTTTTCTGGG AACTGTGGGA TGTGCCCTTG GGGGCCCGAG 180 



AAAACAGAAG GAAGATGCTC CAGACCAGTA ACTACAGCCT GGTGCTCTCT CTGCAGTTCC 240 



TGCTGCTGTC CTATGACCTC TTTGTCAATT CCTTCTCAGA ACTGCTCCAA AAGACTCCTG 300 

TCATCCAGCT TGTGCTCTTC ATCATCCAGG ATATTGCAGT CCTCTTCAAC ATCATCATCA 360 



TTTTCCTCAT GTTCTTCAAC ACCTTCGTCT TCCAGGCTGG CCTGGTCAAC CTCCTATTCC 420 
ATAAGTTCAA AGGGACCATC ATCCTGACAG CTGTGTACTT TGCCCTCAGC ATCTCCCTTC 480 



ATGTCTGGGT CATGAACTTA CGCTGGAAAA ACTCCAACAG CTTCATATGG ACAGATGGAC 540 



TTCAAATGCT GTTTGTATTC CAGAGACTAG CAGCAGTGTT GTACTGCTAC TTCTATAAAC 600 

GGACAGCCGT AAGACTAGGC GATCCTCACT TCTACCAGGA CTCTTTGTGG CTGCGCAAGG 660 

AGTTCATGCA AGTTCGAAGG TGACCTCTTG TCACACTGAT GGATACTTTT CCTTCCTGGA 720 

TAGRAGGCCA CATTTGCTGC TTTGCAGGGG AGAGTTGGGC CCTATGCATG GGGCAAAACA 780 



GGTGGGATTT TCCAAGGGAA GGGTTCAGAA TTAGGCNTGT TGTTTCAGCC ATTTCCAAGG 840 



AAGGGGAAGG GTTTCCCTNC CCT 863 


(2) INFORMATION FOR SEQ ID NO: 154: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 154: 

AACAGCAAAA AAGAATGATT TCTTCTGAAA TTGTGGAACA TGAGGATTCA AGTTTTTATT 60 

TTGTTACTAG GTGCTGGAGG AACATCCCAG TTCACAAAGC CCCCATCTCT TCCTCTGGAG 120 

CCAGAGCCTG CGGTGGAATC AAGTCCAACT GAAACATCAG AACAAATAAG AGAGAAATAA 180 

GAATAGAATG AATGACCCCA AAATARGGTT TTCTTGGGCG AGGATGTGCT GGATTAGGAA 240 

AGGTGACATG ACACAGGCAG AGCAGAGTGG CACCCACCAC AGAATACAGT GTGTGTTATT 300 

ACGAGGAGCC AGCAGTTGAG CCTAAGGTCC TTCTACCTAC CTGGTATTGG CATTTGAGGT 360 

CGGAAACCCT CTACTGCCCC ATAAGCCAGG AAAAGTGAAA AGAGAACACA GTTCCTTTAA 420 

GAACTGGCAG CAAGGCTTGA GGCCTTATGT ATGTAGCTGA GTCAGCAAGG TACATGATGC 480 

TGTCTGCTTT CAAAAGGACT TTTCTCTCCT AGCTGACTGA CTCCTTCCTT AGTTCAAGGA 540 

ACAGCTGAGA CAGACCTCTG CTGAGTAGCT CTGTGATGAC AAACCCTTGG TTTAACTGAG 600 

GTGATCCTCA GGTTGTGAGG TTTATTAGTC CCCAAGGCAA ACACAAATAT TAGATTAATA 660 

ATCCAACTTT AATAGTATAC ATTTAAAAGA AAAAAAACAA AAGCCCTGGA AGNTTGAGGC 720 

CAAGCCTGCT GAGTATTGCA GCTGCATTTG CCCAAAGGGA ATCCAGAACA AGTCCCTCCC 780 

TGTATTTTGT TCTTGAGAGG GGTCAGTCTA GAAGCTAGAT CCTATCAGGA TGAGGAGCAG 840 

CAGCCCAGGG CTTGTCTGGA TCAGCAGCAA CGATTTTAAA GAAAAAAGGA AGAGTTTCTT 900 

AGATGAGTAA TTGTTATTGA AGATAGTCAG TGATAACCAC TGACCAGATG CTATCAATAC 960 

ACTATGTGTC CTTTTTAGAA TAAAGATTAC ATATCATCAT TCCTTTGGGG AAAATTGTTA 1020 

TTCAGGTATA AAAACAAGAG ATTATAATAA AAAANTAAAA GAACCCTAAA AAAAAAAAAC 1080 

CTCGTGCCGA ATTCCCTGCA G 1101 



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2031 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

CAATTAACCC GTTTGAGGCC TAGGTTGTTT GGCAAGCCCC NGGCCTAAAG TTTTAATTCG 60 

GCAGAGCCAA GGGCCTGAAA GGAAGGGAAA GGGGAGGGTA GCGGGAGGGT AGCAGGTGAG 120 

TTCCTAQGGC TGGAAGGTTT AGCAGCAGCC TGGTGCAGTG CCCTGTCATC AAGACAAACC 180 

CACGGTCCTC CTGGGTGCCT ACCAAGCTTG GTTTGTACAA AAGCAAGGTG GGAGTCTATT 240 

TTTGTACATG AGATACATCA CACTTACCTG TGGGCCAGTA TTGTGAAGTG AGTCTGAGTT 300 

GTTTACACTG ATGCCTTCCC TGCCCACCAC AAATTGTGTA CATAGTCTTC AGAATGATAC 360 

CACCCCTTTC CCCAGCTCCC AACCAAGAGC TGGTTCTAGG CCTGTGTTAT ATGTCATATT 420 

TAGCGTTTTT ATATATGACC TTTGATTTCT GTTGTTTGTA TTTTAGCACA GTGTATGCAC 480 

CTTCATTTAA ATACATCTGT GTGCATACAG ATACGCATAT ATGTGTGTGC GTATGCATAT 540 

ATCTCTCATC TGTAGTTTCC AAGAGTTCAG CTGAAGCAGA TGGAGTCCTG CAGCCCAGGA 600 

GACACCCTGC ATCCCTGCTA ATAGTGTTTG CCACAAGTAT TAGTGAGTCT TCCTTATTAA 660 

TATTTTCATT TCAGAAGACT GAAGCAAAGC TGATAGTGTT TGCTGTTTCT TTGGCAGCTA 720 

AGTGAGGGTC TTGGGATGAC TTGCTGTGTT CCTCAAGCTG CACTTTGGGG CCATCTCTGC 780 

AGTATTAAGC CCCCTTITTG CTTGGTGGTA CTCTGTCTGT GCCTGTGTGT GTGTGTGATA 840 

GTCACTCITG CATGGCTTCC ATGTCTGGTT TGTGGCATTT GGGGATAAGT GCTGAACCAG 900 

AGCATTTGCA G1TTGTTTGA GGCCTCGTTG CCAATGATAG ATCACTCCTG TTGACCTGGT 960 

ATGTCTGGTT GCTTGCTGCT TTTCCTTGCT TTCTCTTGGA AGAGGAAAGG ACTCTGGTCA 1020 

GGCCCAGGCT GAGTGAGATG AGCTGCAGCT GGCTCATGGC CTTCTTAGAG CAGAGAGAGG 1080 

AGTATGTCAT TTTACTAAGT TCCTAAACAA ACATTTATGC AGGCAACACT CCTTGCAGAT 1140 

CCAGAAACTG AGGCACAATA GGGTTATGAC TTGCTCAAGA ATATGTAGCT GCTAGGGGGT 1200 

AAATCAAGGC ATCACAATTT CTGTTCAGCG GGCAGGAATA GGCTGTGAAT TGCTAGCACT 1260 

TTTTTTTTAA GCAATTACTT TTTGACTTGT TCCTCTGAAA GTGCAAGAGG CGTACACCTT 1320 

TCCCAAATGT AGACTAGAAT CTGCAGGATG CCACCCACTG TATAGTTCTG CTTTCCCAGA 1380 

GAGGAAGAAC TTTTAGAAAC CAAATGATCT TAATTGTTAT TGCCCACCCC TGGCTTTTCC 1440 

GGGTAGAAAA TTCACAGTAG GAATGATTGT TAAGAGAGAG TGCTTGGAAC CATGGGTTAA 1500 

CAGGAAAGGC TACCTAACTT CACATATCTG CAACCAGAGC AGCCACCAAG CATTACTTAG 1560 

CAGCAGGAAA ATGATTGTAT TTGAGTTCCT GTGTGTCCAA AACTGAGGCA CCATGTTCTT 1620 

TGAAAACATG CCACCTCAAG GCTGGGCGCG GTGGCTCACA CCTGTTAATC CCAGCACTTT 1680 

GGGAGGCCGA GGCGGGCGGA TCACCGGAGT CGGGGAGTTT GAGACCAGCC TGGACCAACA 1740 

TGGGAGAAAC CCCATCTCTA CCTAAAAATA CAAAATTAGC CGGGCGTGGT GGCATGCGCC 1800 

TATAATCTCA GCTACTTGGG AGGGYTGAGG CAGGRGAATT GCTTGAACCC RGGANGGCGG 1860 

AGGTTTGCGG TTGAGTTGAG GATCGTGCCA TTGCACTTCC GGGCCTTGGG GCAACAACAG 1920 

CAAAAAYTCC GTCTTCAAMW MRTGCCGAAT TCGATATCAA GCTTATCGAT ACCGTCGACC 1980 

TCGAGGGGGG GCCCGGTACC CAATTCGCCC TATAGNGATC GTATTACAAT C 2031 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 1981 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY : linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

CCTGCACCCT GAGCCCTTCA CCCCTCCGAG TTCCCCCCAG GTTGGCTTCC TTCGATTCCT 60 

TTTCTTGGTA TCAACGTTTG ATTGGAAGAA CAACCCCCTC TTTGTCAACC TCAATAATGA 120 

GCTCACTGTG GAGGAGCAGC TCGGGCACAG CTCMCCGTYA TGGTCATTGT TACCCCCCAA 180 

GACCGCAAAA ACTCTGTGTG GACACAGGAT GGACCCTCAG CCCAGATCCT GCAGCAGCTT 240 

GTGGTCCTGG CAGCTGAAGC CCTGCCCATG TTAGAGAAGC AGCTCATGGA TCCCCGGGGA 300 

CCTGGGGACA TCAGGACAGT GTTCCGGCCG CCCTTGGACA TTTACGACGT GCTGATTCGC 360 

CTGTYTCCTC GCCATATCCC GCGGCACCGC AGGCTTGTGG ACTCGCCAGY TGCCTCCTTC 420 

TGCCGGGGCC TGCTCAGCCA GCCGGGGCCC TCATCCCTGA TGCCCGTGCT GGGTNATGAT 480 

CCTNCTCAGC TCTATCTGAC GCAGCTCAGG GAGGCCTTTG GGGATCTGGC CCTTTTCTTC 540 

TATGACCAGC ATCGTGGAGA GGTGATTGGT GTCCTCTGGA AGCCCACCAG CTTCCAGCCG 600 

CAGCCCTTCA AGGCCTCCAG CACAAAGGGG CGCATGGTGA TGTCTCGAGG TGGGGAGCTA 660 

GTAATGGTGC CCAATGTTGA AGCAATCCTG GAGGACTTTG CTGTGCTGGG TGAAGGCCTG 720 

GTGCAGACTG TGGAGGCCCG AAGTGAGAGG TGGACTGTGT GATCCCAGCT CTGGAGCAAG 780 

CTGTAGACGG ACAGCAGGAC ATTGGACCTC TAGAGCAAGA TGTCAGTAGG ATGACCTCCA 840 

CCCTCCTTGG ACATGAATCC TCCATGGAGG GCCTGCTGGC TGAACATGCT GAATCATCTC 900 

CAACAAAACC CAGCCCCAAC TTTCTCTCTG ATGCTCCAGC ATTGGGGCAG GGGCATGGTG 960 

GCCCATGTAG TCTCCTGGGC CTCACCATCC CAGAAGAGGA GTGGGAGCCA GCTCAGAGAA 1020 

GGAACTGAAC CCAGGAGATC CATCCACCTA TTAGCCCTGG GCCTGGACCT CCCTGCGATT 1080 

TCCCACTCCT TTCTTAGTCT TCTTCCAGAA ACAGAGAAGG GGATGTGTGC CTGGGAGAGG 1140 

CTCTGTCTCC TTCCTGCTGC CAGGACCTGT GCCTAGACTT AGCATGCCCT TCACTGCAGT 1200 

GTCAGGCCTT TAGATGGGAC CCAGCGAAAA TGTGGCCCTT CTGAGTCACA TCACCGACAC 1260 

TGAGCAGTGG AAAGGGGCTA TATGTGTATG AATAGACCAC ATTGAAGGAG CACAATGCCC 1320 

TCCTGTGTTG ATGCCACTTC CCAGGGTGGA GACAGTGGAA AAGAACCGAG GACAGGAAAG 1380 

GATTGGGTAG GTGAAGGGGT CAGGGGACTG GTAGTCACCC AATCTTGGAG AGGTGCAAAA 1440 

AGCACTGGGG GCTACCCGTT AGCTGCATCT GCCCTGGCTC 1TTGCCCGTT CATGTCACAA 1500 

ACTGCCACTA CTATGTACCT GCAGTGGGGT TGCAGAGATG GGGGAGACTC AAGTCTTACT 1560 

CCCCAGGAGC TCCCAGGGCC CAAQGAGGAG AATGCTGCCT CCTTTCAGTC TGGTCTACAC 1620 

CCACTTTCTG GTAGCCTCTC TGCTTCCTGT AATTCTGGCT GTTTTTCCAG ACTCAGCTCA 1680 

AATAGTGCCC CTCCTTAAGC CCATCCCTCG CCCCCAGCCT GAGGTGATCT TTCCCTCCTC 1740 

TGAACTATTA GAGCAGTTAC TGTCTGTTCA GTTCGTTTGG CAGGCACACA CAGTGGCATA 1800 

AATTCTATTG TTTTGAACTC TGATTTAAAA TTAAATTGCA GCTGGGCGTG GTGGCTCATG 1860 

CTTGTAATCC CAACACTTAG GGAGTMAGGR GAATCACTTG ASCYCAGGAG TYCTAGACCA 1920 

ATCTGGGCAA MAGAGAGACC CCATCTCTTT TAAATAAAAA GTTAAATTGC TTAAAAAAAA 1980 

A 1981 



(2) INFORMATION FOR SEQ ID NO: 157: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 915 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

GAATTCGGCA CGAGCGCGGC CATGGCGCTC CTGCTTTCGG TGCTGCGTGT ACTGCTGGGC 60 

GGCTTCTTCG CGCTCGTGGG GTTGGCCAAG CTCTCGGAGG AGATCTCGGC TCCAGTTTCG 120 

GAGCGGATGA ATGCCCTGTT CGTGCAGTTT GCTGAGGTGT TCCCGCTGAA GGTATTTGGC 180 

TACCAGCCAG ATCCCCTGAA CTACCAAATA GCTGTGGGCT TTCTGGAACT GCTGGCTGGG 240 

TTGCTGCTGG TCATGGGCCC ACCGATGCTG CAAGAGATCA GTAACTTGTT CTTGATTCTG 300 

CTCATGATGG GGGCTATCTT CACCTTGGCA GCTCTGAAAG AGTCACTAAG CACCTGTATC 360 

CCAGCCATTG TCTGCCTGGG GTTCCTGCTG CTGCTGAATG TCGGCCAGCT CTTAGCCCAG 420 

ACTAAGAAGG TGGTCAGACC CACTAGGAAG AAGACTCTAA GTACATTCAA GGAATCCTGG 480 

AAGTAGAGCA TCTCTGTCTC TTTATGCCAT GCAGCTGTCA CAGCAGGAAC ATGGTAGAAC 540 

ACAGAGTCTA TCATCTTGTT ACCAGTATAA TATCCAGGGT CAGCCAGTGT TGAAAGAGAC 600 

ATTTTGTCTA CCTGGCACTG CTTTCTCTTT TTAGCTTTAC TACTCTTTTG TGAGGAGTAC 660 

ATGTTATGCA TATTAACATT CCTCATGTCA TATGAAAATA CAAAATAAGC AGAAAAGAAA 720 

TTTAAATCAA CCAAAATTCT GATGCCCCAA ATAACCACTT TTAATGCCTT GGTGTAAGTA 780 

TACCTCTGAA CTTTTTTCTG TGCCTTTAAA CAGATATATA TTTTTTTTWA ATGAAAATAA 840 

AACCATATAT CCTATTTTAT TTCCTCCTTT TAAAACCTTA TAAACTATAA MAAAAAAAAA 900 

AAAAAAAAAA CTCGA 915 



(2) INFORMATION FOR SEQ ID NO: 158: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2117 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

AGAGCGAAGC GAGGGTGGCG CGGGTCCGGG CATGAAGCTG GGCCGGGCCG TGCTGGGCCT 60 

GCTGCTGCTG GCGCCGTCCG TGGTGCAGGC GGTGGAGCCC ATCAGCCTGG GACTGGCCCT 120 

GGCCGGCGTC CTCACCGGCT ACATCTACCC GCGTCTCTAC TGCCTCTTCG CCGAGTGCTG 180 

CGGGCAGAAG CGGAGCCTTA GCCGGGAGGC ACTGCAGAAG GATCTGGACG ACAACCTCTT 240 

TGGACAGCAT CTTGCAAAGA AAATCATCTT AAATGCCGTG TTTGGTTTCA TAAACAACCC 300 

AAAGCCCAAG AAACCTCTCA CGCTCTCCCT GCACGGGTGG ACAGGCACCG GCAAAAATTT 360 

CGTCAGCAAG ATCATCGCAG AGAATATTTA CGAGGGTGGT CTGAACAGTG ACTATGTCCA 420 

CCTGTTTGTG GCCACATTGC ACTTTCCACA TGCTTCAAAC ATCACCTTGT ACAAGGATCA 480 

GTTACAGTTG TGGATTCGAG GCAACGTGAG TGCCTGTGCG AGGTCCATCT TCATATTTGA 540 

TGAAATGGAT AAGATGCATG CAGGCCTCAT AGATGCCATC AAGCCTTTCC TCGACTATTA 600 

TGACCTGGTG GATGGGGTCT CCTACCAGAA AGCCATGTTC ATATTTCTCA GCAATGCTGG 660 

AGCAGAAAGG ATCACAGATG TGGCTTTGGA TTTCTGGAGG AGTGGAAAGC AGAGGGAAGA 720 

CATCAAGCTC AAAGACATTG AACACGCGTT GTCTGTGTCG GTTTTCAATA ACAAGAACAG 780 

TGGCTTCTGG CACAGCAGCT TAATTGACCG GAACCTCATT GATTATTTTG TTCCCTTCCT 840 

CCCCCTGGAA TACAAACACC TAAAAATGTG TATCCGAGTG GAAATGCAGT CCCGAGGCTA 900 

TGAAATTGAT GAAGACATTG TAAGCAGAGT GGCTGAGGAG ATGACATTTT TCCCCAAAGA 960 

GGAGAGAGTT TTCTCAGATA AAGGCTGCAA AACGGTGTTC ACCAAGTTAG ATTATTACTA 1020 

CGATGATTGA CAGTCATGAT TGGCAGCCGG AGTCACTGCC TGGAGTTGGA AAAGAAACAA 1080 

CACTCAGTCC TTCCACACTT CCACCCCCAG CTCCTTTCCC TGGAAGAGGA ATCCAGTGAA 1140 

TGTTCCTGTT TGATGTGACA GGAATTCTCC CTGGCATTGT TTCCACCCCC TGGTGCCTGC 1200 

AGGCCACCCA GGGACCACGG GCGAGGACGT GAAGCCTCCC GAACACGCAC AGAAGGAAGG 1260 

AGCCAGCTCC CAGCCCACTC ATCGCAGGGC TCATGATTTT TTACAAATTA T3TTTTAATT 1320 

CCAAGTGTTT CTGTTTCAAG GAAGGATGAA TAAGTTTTAT TGAAAATGTG GTAACTTTAT 1380 

TTAAAATGAT TTTTAACATT ATGAGAGACT GCTCAGATTC TAAGTTGTTG GCCTTGTGTG 1440 

TGTGTTTTTT TTTAAGTTCT CATCATTATT ACATAGACTG TGATGTATCT TTACTGGAAA 1500 

TGAGCCCAAG CACACATGCA TGGCATTTGT TCCACAGGAG GGCATCCCTG GGGATGTGGC 1560 

TGGAQCATGA GCCAGCTCTG TCCCAGGATG GTCCCAGCGG ATGCTGCCAG GGGCAKTGAA 1620 

GTGTTTAGGT GAAGGACAAG TAGGTAAGAG GACGCCTTCA GGCACCACAG ATAAGCCTGA 1680 

AACAGCCTCT CCAAQGGTTT TCACCTTAGC AACAATGGGA GCTGTGGGAG TGATTTTGGC 1740 

CACACTGTCA ACATTTGTTA GAACCAGTCT TTTGAAAGAA AAGTATTTCC AACTTGTCAC 1800 

TTGCCAGTCA CTCCGTTTTG CAAAAGGTGG CCCTTCACTG TCCATTCCAA ATAGCCCACA 1860 

CGTGCTCTCT GCTGGATTCT AAATTATGTG AATTTTGCCA TATTAAATCT TCCTCATTTA 1920 

TACTATTATT TGTTACGTTC AATCAGAATC CCCGAAACCT CCTATAAAGC TTAGCTGCCC 1980 

CTTCTGAGGA TGCTGAGAAC GGTGTCTTTC TTTATAAATG CAAATGGCTA CCGTTTTACA 2040 

ATAAAATTTT GCATGTGCAA AAAAAAAAAA ANAAAAAAAA AAAATCCCGG GGGGGGGCCG 2100 

GTAACCAATT TGNCCCC 2117 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2395 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

TGTTCCTTAA TCCCTTTTCT AAAAAGGGGG GAAAATCCGG ATGGATTTTA GGGATTGGTC 60 

TGGTGTCAGC TGTGTTTTAT TGCACACCTA AATCCTGATT ATAGGCTTTT CATTTCTCCG 120 

CAAAGCCTTT ATTTTGGCAG TTAAGCCAAA TGTGTTTTCC AGAAAGTTAG TTATITTCTC 180 

CTCTTTCTTT CCTITCTTTC CTCCCTTTTT CCCGTCTGAC CCCAAACGTT ATTGTCCAAA 240 

CATGACTGGA CAGCAGCTTT TGTTTCTTGA CCCTGTAATA TGACAGTCTG CTAATATTGA 300 

CAGAAGGTGC AGTTTTTGGG TTATAGTCGT GATTTTCGCT AATCAATCAT ATTAGCAGGA 360 

AAAAAAAKGA CTTGTTTCTG TTGTACTTGA GTCTTAAGAA AAAGTGGCCC ATAGTTTAGT 420 

GGACAATTTC CAAAGGCTTT AGTACCACCT GTATTTCAAA ATGGGGGACC CAAACTCCCG 480 

GAAGAAACAA GCTCTGAACA GACTACGTGC TCAGCTTAGA AAGAAAAAAG AATCTCTAGC 540 

TGACCAGTTT GACTTCAAGA TGTATATTGC CTTTGTATTC AAGGAGAAGA AGAAAAAGTC 600 

AGCACTTTTT GAAGTGTCTG AGGTTATACC AGTCATGACA AATAATTATG AAGAAAATAT 660 

CCTGAAAGGT GTGCGAGATT CCAGCTATTC CTTGGAAAGT TCCCTAGAGC TTTTACAGAA 720 

GGATGTGGTA CAGCTCCATG CTCCTCGATA TCAGTCTATG AGAAGGGATG TAATTGGCTG 780 

TACTCAGGAG ATGGATTTCA TTCTTTGGCC TCGGAATGAT ATTGAAAAAA TCGTCTGTCT 840 

CCTGTTTTCT AGGTGGAAAG AATCTGATGA GCCTTTTAGG CCTGTTCAGG CAAATTTGAG 900 

TTTCATCATG GTGACTATGA AAAACAGTTT CTGCATGTAC TGAGCCGCAA GGACAAGACT 960 

GGAATCGTTG TCAACAATCC TAACCAGTCA GTGTTTCTCT TCATTGACAG ACAGCACTTG 1020 

CAGACTCCAA AAAACAAAGC TACAATCTTC AAGTTATGCA GCATCTGCCT CTACCTGCCA 1080 

CAGGAACAGC TCACCCACTG GGGCAGTTGG CACCATAGAG GRTCACCTCC GTCCTTATAT 1140 

GCCAGAGTAG AGTACTGACC AGCAAAATGG AGAAGATCAG AGAATGCAGC AGCAGTTTTT 1200 

TTTCrTGTTT TCTTACCACT TTATTCTTTC AGAGTTTAAA GAAAATGGAC TCATGCACAG 1260 

AACACTATGC ATTTTGAAAC TTGTTCATCC TGGATTTTTT TAAATCATTT TTATCTCAGA 1320 

ACTTAAACAA AAATTAGATG TCGTGCACGG ACTGTGTGAA AGAAGATGCT TTGCATATTT 1380 

GCTGCACTGC ATCAGTATCT TACTAAAAAT GTGAAATGAA AGGACTATTG TACACTGAAA 1440 

TGCTTAAATG TATCTGAAAG CACAAGGTGA TACTCATTTT TATGGTCTTC CCATTTGTGC 1500 

TGGTTTTTGC CTCTTTGACA TCTGTCATCA GTATTTAGAG GGTGAGAAGT GAATGTAACA 1560 

GGTATAAATA ACATTTTTAA AAACAATAAC TTTGCTATAA TCACAGTTGT TCCAGAGCAC 1620 

TGTCAGATAC ATTCTAATGA CCAGAACTGG TTTAAAAAAA GAAAATACAA CCATGGGAAA 1680 

GAAATCTTAA ATGAAAAACG CATCTCATTG TAGGCATTTT TGCCTCATAT TTTACTGGGC 1740 

CATGTTTGTT TCCTGGTACT CATGTATTTT TTTTTTCCAG ATCTCTTTCC CCAAGTTGCT 1800 

ATTGTAAGAG TATTCTGCTG CGTGTGGATG CAGTTATACA CATTAAAGCA GATCTGGAGT 1860 

CTGAAGTAGC TATAAAGCAG CTATAAAACA GAAATACATG CATAGCTGCA GAAACCATGA 1920 

TAGGTAGAGG ACTTTTCTTT TGGTTTTGTT TTGTTTTGTT TTGTTTTGTT TTTGGTTTTA 1980 

CAGAGAAGAG AITTTTATTA CAAAGAAAAA AATTCCAGTG AATTGTGCAG AAATGCTGGT 2040 

40 TTTTACACCA TCCTAAAGAA AAACTTTACA AGGGTGTTTT GGAGTAGAAA AAAGGTTATA 2100 

AAGTTGGAAT CTTAAATTGT AAAATTAACC ATTGAGTGTC AAAGTTCTAA AAGCAGAACT 2160 

CATTTTGTGC AATGAACATA AGGAAAGACT ACTGTATAGG TTTTTTTTTT TTCTCCTTTT 2220 

45 

AAATGAAGAA AAGCTTTGCT TAAGGGTTGC ATACTTTTAT TGGAGTAAAT CTGAATGATC 2280 

CTACTCCTTT GGAGTAAAAC TAGTGCTTAC CAGTTTCCAA TTGTATTTAG CTTCTGGTTG 2340 

50 GAATTTGAAA AAAAAAGAAA AAAAGAAAAA GAAAACCTAA ATAAAATAGG TGAAA 2395 



(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2120 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double






(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

CCCCGGATAC CGCCTGACGT AGTGCCAATC ACACCTCTCG CGTCTCGGCG CCTCGGAGGC 60 

TAATGAGGAC GCCTGGCGAA ACGCAGTAAC GGATTTCCGG GTGGACCTTC GCTTTACGGC 120 

TCGTGAGTTC TTCCGCCCAA CCCAGAGGAA GCGGGAGAGC AGTTTACGAC AGCGCCGGTC 180 

GTGTTTACGG CGGCGCCCGC TGCGCGCGCA TGTTTCCTCT TTTCCTGGTT TCTCAAGAGT 240 

GCTGCTGCTA ACGCGGTCCC CGGCACGCAC CATCTGTTGC CATCCCGGCC GGCCGAGGCA 300 

TTGCAGATTT TGGAAGATGG CAAAGTTCAT GACACCCGTG ATCCAGGACA ACCCCTCAGG 360 

CTGGGGTCCC TGTGCGGTTC CCGAGCAGTT TCGGGATATG CCCTACCAGC CGTTCAGCAA 420 

AGGAGATCGG CTAGGAAAGG TTGCAGACTG GACAGGAGCC ACATACCAAG ATAAGAGGTA 480 

CACAAATAAG TACTCCTCTC AGTTTGGTGG TGGAAGTCAA TATGCTTATT TCCATGAGGA 540 

GGATGAAAGT AGCTTCCAGC TGGTGGATAC AGCGCGCACA CAGAAGACGG CCTACCAGCG 600 

25 GAATCGAATG AGATTTGCCC AGAGGAACCT CCGCAGAGAC AAAGATCGTC GGAACATGTT 660 

GCAGTTCAAC CTGCAGATCC TGCCTAAGAG TGCCAAACAG AAAGAGAGAG AACGCATTCG 720 

ACTGCAGAAA AAGTTCCAGA AACAATTTGG GGTTAGGCAG AAATGGGATC AGAAATCACA 780 

30 

GAAACCCCGA GACTCTTCAG TTGAAGTTCG TAGTGATTGG GAAGTGAAAG AGGAAATGGA 840 

TTTTCCTCAG TTGATGAAGA TGCGCTACTT GGAAGTATCA GAGCCACAGG ACATTGAGTG 900 

35 TTGTGGGGCC CTAGAATACT ACGACAAAGC CTTTGACCGC ATCACCACGA GGAGTGAGAA 960 

GCCACTGCGG ASATNCAAGC GCATCTTCCA CACTGTCACC ACCACAGACG ACCCTGTCAT 1020 

CCGCAAGCTG GCAAAAACTC AGGGGAATGT GTTTGCCACT GATGCCATCC TGGCCACGCT 1080 

40 

GATGAGCTGT ACCCGCTCAG TGTATTCCTG GGATATTGTC GTCCAGAGAG TTGGGTCCAA 1140 

ACTCTTCTTT GACAAGAGAG ACAACTCTGA CTTTGACCTC CTGACAGTGA GTGAGACTGC 1200 

45 CAATGAGCCC CCTCAAGATG AAGGTAATTC CTTCAATTCA CCCCGCAACC TGGCCATGGA 1260 

GGCAACCTAC ATCAACCACA ATTTCTCCCA GCAGTGCTTG AGAATGQGGA AGGAAAGATA 1320 

CAACTTCCCC AACCCAAACC CGTTTGTGGA GGACGACATG GATAAGAATG AAATCGCCTC 1380 

50 

TGTTGCGTAC CGTTACCGCA GTGGNAAGCT TGGAGATGAT ATTGACCTTA TTGTCCGTTG 1440 

TGAGCACGAT GGCGTCATGA CTGGAGCCAA CGGGGAAGTG TCCTTCATCA ACATCAAGAC 1500 

55 ACTCAATGAG TGGGATTCCA GGCACTGTAA TGGCGTTGAC TGGCGTCAGA AGCTGGACTC 1560 

TCAGCGAGGG GCTGTCATTG CCACGGAGCT GAAGAACAAC AGCTACAAGT TGGCCCGGTG 1620 

GACCTGCTGT GCTTTGCTGG CTGGATCTGA GTACCTCAAG CTTGGTTATG TGTCTCGGTA 1680 

CCACGTGAAA GACTCCTCAC GCCACGTCAT CCTAGGCACC CAGCAGTTCA AGCCTAATGA 1740

GTTTGCCAGC CAGATCAACC TGAGCGTGGA GAATGCCTGG GGCATTTTAC GCTGCGTCAT 1800

TGACATCTGC ATGAAGCTGG AGGAGGGCAA ATAOCTCATC CTCAAGGACC CCAACAAGCA 1860

GGTCATCCGT GTCTACAGCC TCCCTGATGG CACCTTCAGC TCTGATGAAG ATGAGGAGGA 1920

AGAGGAGGAG GAAGAAGAGG AAGAAGAAGA GGAAGAAACT TAAACCAGTG ATGTGGAGCT 1980

GGAGTTTGTC CTTCCACCGA GACTACGAGG GCCTTTGATG CTTAGTGGAA TGTGTGTCTA 2040

ACTTGCTCTC TGACATTTAG CAGATGAAAT AAAATATATA TCTGTTTAGT CTTAAAAAAA 2100

AAAAAAAAAA AAAAAAAAAN 2120



(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

30 GGAAGCTGAA GTCCTTCCAG ACCAGGGACA ACCAGGGCAT TCTCTATGAA GCTGCACCCA 60 

CCTCCACCCT CACCTGTRAC TCAGGACCAC AGAAGCAAAA GTTCTCACTC AAACTGGATG 120 

CCAAGGATGG GCGCTTGTTC AATGAGCAGA ACTTCTTCCA GCGGGCCGCC AAGCCTCTGC 180 

35 

AAGTCAACAA GTGGAAGAAG CTGTACTCGA CCCCACTGCT GGCCATCCCT ACCTGCATGG 240 

GTTTCGGTGT TCACCAGGAC AAATACAGGT TCTTGGTGTT ACCCAGCCTG GGGAGGAGCC 300 

40 TTCAGTCGGC CCTGGATGTC AGCCCAAAGC ATGTGCTGTG CAGAGAGGTC TGTGCTGCAG 360 

GTGGCCTGCC GGCTGCTGGA TGCCCTGGAG TTCCTCCATG AGAATGAGTA TGTTCATGGA 420 

AATGTGACAG CTGAAAATAT CTTTGTGGAT CCAGAGGACC AGAGTCAGGT GACTTTGGCA 480 

45 

GGCTATGGCT TCGCNTTCCG CTATTGCCCA AGTGGCAAAC ACGTGGCCTA CGTGGAAGGC 540 

AGCAGGAGCC CTCACGAGGG GGACCTTGAG TTCATTAGCA TGGACCTGCA CAAGGGATGC 600 

50 GGGCCCTCCC GCCGCRGCGA CCTCCAGAGC CTGGGCTACT GCATGCTGAA GTGGCTCTAC 660 

GGGTTTCTGC CATGGACAAA TTGCCTTCCC AAMAWTGAGG ACATCATGAA GCAAAAACAG 720 

AAGTTTGTTG ATAAGCCGGG GCCCTTCGTG GGACCCTGCG GTCACTGGAT CAGGCCCTCA 780 

55 

GAGACCCTGC AGAAGTACCT GAAGGTGGTG ATGGCCCTCA CGTATGAGGA GAAGCCGCCC 840 

TACGCCATGC TGAGGAACAA CCTAGAAGCT TTGCTGCAGG ATCTGCGTGT GTCTCCATAT 900 

(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1003 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:



GGCACGAGAT GAGGGGCACC CAGTGCTTCT AGGGCAGGCT GGGTGGTGGT CCCCTAGGTA 60 
15 TCAGCCTCTC TTACTGTACT CTCCGGGAAT GTTAACCTTT CTATTTTCAG CCTGTGCCAC 120 



CTGTCTAGGC AAGCTGGCTT CCCCATTGGC CCCTGTGGGT CCACAGCAGC GTGGCTGCCC 180 



CCCAGGGCCA CCGCTTCTTT CTTGATCCTC TTTCCTTAAC AGTGACTTGG GCTTGAGTCT 240 

20 

GGCAAGGAAC CTTGCTTTTA GCTTCACCAC CAAGGAGAGA GGTTGACATG ACCTCCCCGC 300 



CCCCTCACCA AGGCTGGGAA CAGAGGGGAT GTGGTGAGAG CCAGGTTCCT CTGGCCCTCT 360 
25 CCAGGGTGTT TTCCACTAGT CACTACTGTC TTCTCCTTGT AGCTAATCAA TCAATATTCT 420 



TCCCTTGCCT GTGGGCAGTG GAGAGGCTGC TGGGTGTACG CTGCACCTGC CCACTGAGTT 480 



GGGGAAAGAG GATAATCAGT GAGCACTGTT CTGCTCAGAG CTCCTGATCT ACCCCACCCC 540 

30 

CTAGGATCCA GGACTGGGTC AAAGCTGCAT GAAACCAGGC CCTGGCAGCA AACCTGGGAA 600 



TGGCTGGAGG TGGGAGAGAA CCTGAACTTC TCTTTCCCTC TCCCTCCTCC AACATTACTG 660 
35 GAACTCTATC CTGTTAGGAT CTTCTGAGCT TGTTTCCCTG CTGGGTGGGA CAGAGGACAA 720 



AGGAGAAGGG AGGGTCTAGA AGAGGCAGCC CTTCTTTGTC CTCTGGGGTA AATGAGCTTG 780 



ACCTAGAGTA AATGGAGAGA CCAAAAGCCT CTGATTTTTA ATTTCCATAA AATGTTAGAA 840 

40 

GTATATATAT ACATATATAT ATTTCTTTAA ATTTTTGAGT CTTTGATATG TCTAAAAATC 900 



CATTCCCTCT GCCCTGAAGC CTGAGTGAGA CACATGAAGA AAACTGTGTT TCATTTAAAG 960 



45 ATGTTAATTA AATGATTGAA ACTTGAAAAA AAAAAAAAAA AAA 1003 



(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2196 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:



AAGAAGCGGC ACACGGATGT GCAGTTCTAC ACAGAAGTGG GAGAGATAAC CACGGACTTG 60

GGGAAACATC AGCATATGCA TGACCGAGAT GACCTCTATG CTGAGCAGAT GGAACGAGAA 120

ATGAGGCACA AACTGAAAAC AGCCTTTAAA AATTTCATTG AGAAAGTAGA GGCTCTAACT 180 

5 

AAGGAGGAAC TGGAATTTGA AGTGCCTTTT AGGGACTTGG GATTTAACGG AGCTCCCTAT 240 

AGGAGTACCT GCCTCCTTCA GCCCACTAGT AGTGCGCTGG TAAATGCTAC GGAATGGCCA 300 

10 CCTTTTGTGG TGACATTGGA TGAGGTAGAG CTGATCCACT TTRAGCGGGT CCAGTTTCAC 360 

CTGAAGAACT TTGATATGGT AATCGTCTAC AAGGACTACA GCAAGAAAGT GACCATGATC 420 

AACGCCATTC CTGTAGCCTC TCTTGACCCC ATCAAGGAAT GGTTGAATTC CTGCGACCTG 480 

15 

AAATACACAG AAGGAGTACA GTCCCTCAAC TGGACTAAAA TCATGAAGAC CATTGTTGAT 540 

GACCCTGAGG GCTTCTTCGA ACAAGGTGGC TGGTCTTTCC TGGAGCCTGA GGGTGAGGGG 600 

20 AGTGATGCTG AAGAAGGGGA TTCAGAGTCT GAAATTGAAG ATGAGACTTT TAATCCTTCA 660 

GAAGATGACT ATGAAGAGGA AGAGGAGGAC AGTGATGAAG ATTATTCATC AGAAGCAGAA 720 

GAGTCAGACT ATTCTAAGGA GTCATTGGGT AGTGAAGAAG AGAGTGGAAA GGATTGGGAT 780 

25 

GAACTGGAGG AAGAAGCCCG AAAAGCGGAC CGAGAAAGTC GTTACGAGGA AGAAGAAGAA 840 

CAAAGTCGAA GTATGAGCCG GAAGAGGAAG GCATCTGTGC ACAGTTCGGG CCGTGGCTCT 900 

30 AACCGTGGTT CCAGACACAG CTCTGCACCC CCCAAGAAAA AGAGGAAGTA ACTTCTGAAC 960 

TTTGGCCCTG AGCTCCATTC TTCCTCCAGC CAACCCCTGA AAATTTTACA TGACATAGAA 1020 

ACTGTATTTT TCCTTTCGTT TTCATTTGAA GTTTTGCCAT TTGTGTTTAT GGGTTTAGGG 1080 

35 

GGCCATTTGT GTGGACCAAT CTACTCGGGG AATTCCAGGC CCACCAGGAC ACGTGCCAAT 1140 

GGCCCCATTC AGATGGCAAG GGAGGAGGTG TTCTTGAAGA CAGGAGGAGG CTCCCGCTGT 1200 

40 TAATAAATAT TGTTTCATTC TTCTCTCTTC CTGTCACCTT CTGCCAAGAC A1TGATGGCT 1260 

TCTGACATCT TATTTGGTGT CTCAAAGCTG TATTTCCAAG ACAGTGGTAC AAGGTGACCC 1320 

TTAATTACCC GTATCATGGT TCTTGACCAG CACATTCAAT CCTCCAACCT ACCCTACTGC 1380 

45 

CATGACCTTC CGCACATCTC TAAGTTTTAT CTTTGCAATA CTCAAGGTTC TCGGAAATTT 1440 

GCTAATGGTT GTGATAAACC ATACAGCTTG AGCCAGTGAG GCAGATTGGG CTGGTGCCTT 1500 

50 CGTCTGAGTT TTCCTGCTTT CCTGCCTCGT GCAGATTCTG AGGTATATCT GCTGCCTTGG 1560 

AAGACATAAG AAGCAGTGAT ACTCCCTGGC TCGGTTATTT TCTCCATACA ATGCACACAT 1620 

GGTACAATGA TAGAAGGCAA AATTGCCACT GTCTTCTTTT TTTTCTCATA TATCTAAGGA 1680 

55 

AGATATATCA GGTTGTGCCT CATGTACCGC TTCTAGTGAA ATGTAGAGGA AGGCTCAAAG 1740 

GAGTCAACAT TTAGATCTGG AAGGGACAAG TCATGCCTTG GGCCTAGAAT ACCCTGATGA 1800 

GAAAAGAGAA GAGGAAGGGA GGCCATATCT ACAACANCAN CCTCTCGGCA CTGCTGCTCC 1860

TTATTTTAAC TTTGTCTTGC ATTGTCCTGT ATTTATCACA GTTTCTGTTG AACAGCTTTT 1920

CAAGTATTTG GGGAGTTTAT CTTGCCATCC TCCCCTTCTG GTTCTCTGCA CCCACCTGTC 1980 

5 

CCACTGCAGT TCCTTCCGTG CTCTGTGACT TTAAGAGAA3 AAGQGGGGAG GGGTCCCGGA 2040 

TTTTATGTTT GTTTGTTTTT TCTCCTTAGC AGTAGGACTT GATATTTTCA ATTTTGGAAG 2100 

10 AACTAAAAGA TGAATAAACT GGGTTTTTTT TGTTGTTTGT TTTTGTAAAA AAAAAAAAAA 2160 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 2196 



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1945 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GCACAGAGTC GGGCGGACGG ACAGGGAGAG GAGGAGAGGG GGTCTGCGCG CGGCCGCTAC 60 

CCAGAAGCCA GCGGACGGCA GCACGGAGTG GGCTGTCCCC GAGCCCAGCC CCGAGCGAGC 120 

30 

CCCCCCCCCG CCCCCGMAGG ACGCGCCTYC CAGCCAGCCC GACTYCTAGG AGGAGGGGAG 180 

GCGGGAAAGC AGCTCAAGCC TCACCCACCG CCCTGCCCCC AGCCCCGCCA CTCCCAGGCT 240 

35 CCTCGGGACT CGGCGGGTCC TCCTGGGAGT CTCGGAGGGG ACCGGCTGTG CAGACGCCAT 300 

GGAGTTGGTG CTGGTCTTCC TCTGCAGCCT GCTGGCCCCC ATGGTCCTGG CCAGTGCAGC 360 



TGAAAAGGAG AAGGAAATGG ACCCTTTTCA TTATGATTAC CAGACCCTGA GGATTGGGGG 420 

40 

ACTGGTGTTC GCTGTGGTCC TCTTCTCGGT TGGGATCCTC CTTATCCTAA GTCGCAQGTG 480 

CAAGTGCAGT TTCAATCAGA AGCCCCGGGC CCCAGGAGAT GAGGAAGCCC AGGTGGAGAA 540 

45 CCTCATCACC GCCAATGCAA CAGAGCCCCA GAAAGCAGAG AACTGAAGTG CAGCCATCAG 600 

GTGGAAGCCT CTGGAACCTG AGGCGGCTGC TTGAACCTTT GGATGCAAAT GTCGATGCTT 660 



AAGAAAACCG GCCACTTCAG CAACAGCCCT TTCCCCAGGA GAAGCCAAGA ACTTGTGTGT 720 

50 

CCCCCACCCT ATCCCCTCTA ACACCATTCC TCCACCTGAT GATGCAACTA ACACTTGCCT 780 

CCCCACTGCA GCCTGCGGTC CTGCCCACCT CCCGTGATGT GTGTGTGTGT GTGTGTGTGT 840 

55 GTGACTGTGT GTGTTTGCTA ACTGTGGTCT TTGTGGCTAC TTGTTTGTGG ATGGTATTGT 900 

GTTTGTTAGT GAACTGTGGA CTCGCTTTCC CAGGCAGGGG CTGAGCCACA TGGCCATCTG 960 



CTCCTCCCTG CCCCCGTGGC CCTCCATCAC CTTCTGCTCC TAGGAGGCTG CTTGTTGCCC 1020 

GAGACCAGCC CCCTCCCCTG ATTTAGGGAT GCGTAGGGTA AGAGCACGGG CAGTGGTCTT 1080

CAGTCGTCTT GGGACCTGGG AAGGTTTGCA GCACTTTGTC ATCATTCTTC ATGGACTCCT 1140

TTCACTCCTT TAACAAAAAC CTTGCTTCCT TATCCCACCT GATCCCAGTC TGAAGGTCTC 1200

TTAGCAACTG GAGATACAAA GCAAGGAGCT GGTGAGCCCA GCGTTGACGT CAGGCAGGCT 1260

ATGCCCTTCC GTGGTTAATT TCTTCCCAGG GGCTTCCACG AGGAGTCCCC ATCTGCCCCG 1320

CCXTCTTCACA GAGCGCCCGG GGATTCCAGG CCCAGGGCTT CTACTCTGCC CCTGGGGAAT 1380

GTGTCCCCTG CATATCTTCT CAGCAATAAC TCCATGGGCT CTGGGACCCT ACCCCTTCCA 1440

ACCTTCCCTG CTTCTGAGAC TTCAATCTAC AGCCCAGCTC ATCCAGATGC AGACTACAGT 1500

CCCTGCAATT GGGTCTCTGG CAGGCAATAG TTGAAGGACT CCTGTTCCGT TGGGGCCAGC 1560

ACACCGGGAT GGATGGAGGG AGAGCAGAGG CCTTTGCTTC TCTGCCTACG TCCCCTTAGA 1620

TGGGCAGCAG AGGCAACTCC CGCATCCTTT GCTCTGCCTG TCRGTGGTCA GAGCGGTGAG 1680

CGAGGTGGGT TGGAGACTCA GCAGGCTCCG TGCAGCCCTT GGGAACAGTG AGAGGTTGAA 1740

GGTCATAACG AGAGTGGGAA CTCAACCCAG ATCCCGCCCC TCCTGTCCTC TGTGTTCCCG 1800

CGGAAACCAA CCAAACCGTG CGCTGTGACC CATTGCTGTT CTCTGTATCG TGATCTATCC 1860

TCAACAACAA CAGAAAAAAG GAATAAAATA TCCTTTGTTT CCTAGTGAAA AAAAAAAAAA 1920

AAAAAAAAAA AAAAAAAAAA CTCGA 1945



(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2933 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

GGGTCGACCC ACGCGTCCGG CAGCCGTCGT TTGAGTCGTT GCTGCCGCTG CCCCCTCCCG 60 

GATCAGGAGC CAGTGTATAC CGCCCGCCCA CCGCCTTGGT GCCGCTAGAG GAAACGAGAA 120 

50 GGAGGCCGCC TGCGGTTTGT CGCCGCAGCT CGCCCMCYGY CYGGRAGAGC CGAGCCCCGG 180 

CCCAGTCGGT CGCYTGCCAC CSCTCGTAGC CGTTACCCGC GGGCCGCCAC AGCCGCCGGC 240 

CGGGAGAGGC GCGCGCCATG GCYTCTGGAG CCGATTCAAA AGGTGATGAC CTATCAACAG 300 

CCATTCTCAA ACAGAAGAAC CGTCCCAATC GGTTAATTGT TGATGAAGCC ATCAATGAGG 360 

ACAACAGTGT GGTGTCCTTG TCCCAGCCCA AGATGGATGA ATTGCAGTTG TTCCGAGGTG 420 

ACACAGTGTT GCTGAAAGGA AAGAAGAGAC GAGAAGCTGT TTGCATCGTC CTTTCTGATG 480

ATACTTGTTC TGATGAGAAG ATTCGGATGA ATAGAGTTGT TCGGAATAAC CTTCGTGTAC 540

GCCTAGGGGA TCTCATCAGC ATCCAGCCAT GCCCTGATGT GAAGTACGGC AAACGTATCC 600

ATGTGCTCCC CATTGATGAC ACAGTGGAAG GCATTACTGG TAATCTCTTC GAGGTATACC 660

TTAAGCCGTA CTTCCTGGAA GCGTATCGAC CCATCCGGAA AGGAGACATT rrTCTTGTCC 720

GTGGTGGGAT GCGTGCTGTG GAGTTCAAAG TGGTGGAAAC AGATCCTAGC CCTTATTGCA 780

TTOITCCTCC AGACACAGTG ATCCACTGCG AAGGGGAGCC TATCAAACGA GAGGATGAGG 840

AAGAGTCCTT GAATGAAGTA GGGTATGATG ACATTGGTGG CTGCAGGAAG CAGCTAGCTC 900

AGATAAAGGA GATGGTGGAA CTGCCCCTGA GACATCCTGC CCTCTTTAAG GCAATTGGTG 960

TGAAGCCTCC TAGAGGAATC CTGCTTTACG GACCTCCTGG AACAGGAAAG ACCCTGATTG 1020

CTCGAGCTGT AGCAAATGAG ACTGGAGCCT TCTTCTTCTT GATCAATGGT CCTGAGATCA 1080

TGAGCAAATT GGCTGGTGAG TCTGAGAGCA ACCTTCGTAA AGCCTTTGAG GAGGCTGAGA 1140

AGAATGCTCC TGCCATCATC TTCATTGATG AGCTAGATGC CATCGCTCCC AAAAGAGAGA 1200

AAACTCATGG CGAGGTGGAG CGGCGCATTG TATCACAGTT GTTGACCCTC ATGGATGGCC 1260

TAAAGCAGAG GGCACATGTG A1TGTTATGG CAGCAACCAA CAGACCCAAC AGCATTGACC 1320

CAGCTCTACG GCGATTTGGT CGCTTTGACA GGGAGGTAGA TATTGGAATT CCTGATGCTA 1380

CAGGACGCTT AGAGATTCTT CAGATCCATA CCAAGAACAT GAAGCTGGCA GATGATGTGG 1440

ACCTGGAACA GTAGCCAATG AGACTCACGG GCATGTGGGT GCTGACTTAG CAGCCCTGTG 1500

CTCAGAGGCT GCTCTGCAAG CCATCCGCAA GAAGATGGAT CTCATTGACC TAGAGGATGA 1560

GACCATTGAT GCCGAGGTCA TGAACTCTCT AGCAGTTACT ATGGATGACT TCCGGTGGGC 1620

CTTGAGCCAG AGTAACCCAT CAGCACTGCG GGAAACCGTG GTAGAGGTGC CACAGGTAAC 1680

CTGGGAAGAC ATCGGGGGCC TAGAGGATGT CAAACGTGAG CTACAGGAGC TGGTCCAGTA 1740

TCCTGTOGAG CACCCAGACA AATTCCTGAA GTTTGGCATG ACACCTTCCA AGGGAGTTCT 1800

GTTCTATGGA CCTCCTGGCT GTGGGAAAAC TTTGTTGGCC AAAGCCATTG CTAATGAATG 1860

CCAGGCCAAC TTCATCTCCA TCAAGGGTCC TGAGCTGCTC ACCATGTGGT TTGGGGAGTC 1920

TGAGGCCAAT GTCAGAGAAA TCTTTGACAA GGCCCGCCAA GCTGCCCCCT GTGTGCTATr 1980

CTTTGATGAG CTGGATTCGA TTGCCAAGGC TCGTGGAGGT AACATTGGAG ATGGTGGTGG 2040

GGCTGCTGAC CGAGTCATCA ACCAGATCCT GACAGAAATG GATGGCATGT CCACAAAAAA 2100

AAATGTGTTC ATCATTGGCG CTACCAACCG GCCTGACATC ATTCATCCTG CCATCCTCAG 2160

ACCTGGCCGT CTTGATCAGC TCATCTACAT CCCACTTCCT GATGAGAAGT CCCGTGTTGC 2220

CATCCTCAAG GCTAACCTGC GCAAGTCCCC AGTTGCCAAG GATOTGGACT TGGAGTTCCT 2280

GGCTAAAATG ACTAATGGCT TCTCTGGAGC TGACCTGACA GAGATTTGCC AGCGTGCTTG 2340

CAAGCTGGCC ATCCGTGAAT CCATCGAGAG TGAGATTAGG CGAGAACGAG AGAGGCAGAC 2400 

5 

AAACCCATCA GCCATGGAGG TAGAAGAGGA TGATCCAGTG CCTGAGATCC GTCGAGATCA 2460 

CTTTGAAGAA GCCATGCGCT TTGCGCGCCG TTCTGTCAGT GACAATGACA TTCGGAAGTA 2520 

10 TGAGATGTTT GCCCAGACCC TTCAGCAGAG TCGGGGCTTT GGCAGCTTCA GATTCCCTTC 2580 

AGGGAACCAG GGTGGAGCTG GCCCCAGTCA GGGCAGTGGA GGCGGCACAG GTGGCAGTGT 2640 

ATACACAGAA GACAATGATG ATGACCTGTA TGGCTAAGTG GTGGTGGCCA GCGTGCAGTG 2700 

15 

AGCTGGCCTG CCTGGACCTT GTTCCCTGGG GGTGGGGGCG CTTGCCCAGG AGAGGGACCA 2760 

GGGGTGCGCC CACAGCCTGC TCCATTCTCC AGTCTGAACA GTTCAGCTAC AGTCTGACTC 2820 

TGGACAGGGG GTTTCTGTTG CAAAAATACA AAACAAAAGC GATAAAATAA AAGCGATTTT 2880

CATTTGGTAA AAAAAAAAAA AAAAAAAAAT CCGGGGGGGG GCCCGAACCA TIT 2933



(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

TCGGAGAGCC GGCGGGCGNG CGCCTCTCGG CCAGGAAGCG CCTCTTGGAC GCGTGTNACC 60 

GATGCCCAGA AGTGGCCTTG GGCTGGGGAT CACCATAGCT TTTCTAGCTA CGCTGATCAC 120 

40 

GCAGTTTCTC GTGTATAATG GTGTCTATCA GTATACATCC CCAGATTTCC TCTATATTCG 180 

TTCTTGGCTC CCTTGTATAT TTTTCTCAGG AGGCGTCACG GTGGGGAACA TAGGACGACA 240 

45 GTTAGCTATG GGTGTTCCTG AAAAGCCCCA TAGTGATTGA GTCTTCAAAA CCACCGATTC 300 

TGAGAGCAAG GAAGATTTTG GAAGAAAATC TGACTGTGGA TTATGACAAA GATTATCTTT 360 

TTTCTTAAGT AATCTATTTA GATCGGGCTG ACTGTACAAA TGACTCCTGG AAAAAACTCT 420 

50 

TCACCTAGTC TAGAATAGGG AGGTGGAGAA TGATGACTTA CCCTGAAGTC TTCCCTTGAC 480 

TGCCCGCACT GGCGCCTGTC TGTGCCCTGG AGCATTCTGC CCAGGCTACG TGGGTTCAGG 540 

55 CAGGTGGCAG CTTCCCAAGT ATTCGATTTC ATTCATGTGA TTAAAACAAG TTGCCATATT 600 

TCAAAGCCTT GAACTAAGAC TCAATTACCA ACCCGCAGTT TTGTGTCAGT GCCCAAAGGA 660 

GGTAGGTTGA TGGTGCTTAA CAAACATGAA GTATGGTGTA ATAGGAATAA TATTTATCCA 720 


AAAGATTTTT AAAAATAGGG CTGTGTXTAA AAAAAAAAAC AAAACARGAA AAGCAGCAGT 780 

GATTATAGAG AGGTCACACT CTAAGTGGGG TCGCGGCGTG GCCACGCTTC ACGGTCACGC 840 

5 TCGTCCGTCC TGCAGTGGCG TGTTTACATG GTCACACGTG TGTGTATCAC CAGTGGGTCA 900 

ACTGCTTGTC ATTCCTCCCG TGGCAGTTTG TGTAGACAAT CTTACTGAGC AAAAGGCAAT 960 

GAAAAGTCTT GGTTCCCACA CTGCGATATA 1TGGAATTTT CACCTCAGTT TATGAAGTTT 1020 

10 

ATTTCGAAAT CCATAGTCAT CTAAGAATGA ATACCTGTCT GCCATGTATT TCAATCTTAG 1080 

TGAGCCAAAA TTGTTTGTTT GTTACTACAG AATAGAGATG ACTGTTTTTT GCCACAGCCC 1140 

15 TATGGRATTT GCAATCTGTG ATTGCCTTGT AAAAAGGAGA GTGCATATGG CACTGCATTA 1200 

AACGTGTGGT GTTTCTAGTC AATGATATTG GTGAGCACAA TGTATTCATT TAATGGCATA 1260 

GACCATACCA GACCTAATTT GCAAGTATTG GGTCTTAAAC 1TCAAGTGCA ATGTATATGA 1320 

20 

AAACCAATCT GAGCCTTGTA TCTCTTAAAT ATTTATTTTT TTTAACGTGT GAGATGTTCG 1380 

AGAGAAGGTT CTCCATTCAT TTCAGTGCTG CCTGGAGGAA ACTCGGCAAT GATTTCTTTC 1440 

25 AGTTGTGAAG TTCCTTTCGT GTTACACCCT CCACTGAACC CTCAACCTTC GAAATACTCC 1500 

AGTTTTCTGG GTTTGGTCAT TTTTACTTAT AAATTTACCT TTTTGTATTT TGCAATTTAC 1560

ATGTGTTTGG TTTGTTTTAA ATTCTGTGAA AGTGGCTTGA TTAAAAGACT CCTTTTAAAT 1620 

30 

GGAAGCCACC AGTCAGCAGA ATGGAAGCTT AGAGGAACTT GCCTGTGAGC GCTGGTCTTT 1680 

GTGTTTGGTT TTGTGATGTA ACGATCTTTG CTGGGGTTTT TTGCTTTGTT TTGAGGGAAA 1740 

35 TGTCTTGGAG TAAATTTTAA GTTCCTGGAG TTAATTTGTT TTACAGGAAT TTTGTTTTTT 1800 

AAAAAAATAG GATCATTCTG AACTTTGGAA TGACCCCCTT ATATATTTTC TGAAAATGAA 1860 

AACAGTTACA TGAAAAAAAT TTCCAATGAA GATGTCAGCA TTTTATGAAA AACCAGAAGT 1920 

40 

TATTAGATGA AAGCAGCGAG TGAATCTTTA AAACAGACTT GATCACGCAC ACACAATAAG 1980 

TCTTTCTCTC CGAAACCGGA AGTAAATCTA TATCTGTTAG AAATAATGTA GCCAAAAGAA 2040 

45 TGTAAATTTG AGGATTTTTT TGCCAATAGT TTATAGAAAA TATATGAACC AAAGTGATTT 2100 

GAGTTTGTAA AAATGTAAAA TAGTATGAAC AAAATTTGCA CTCTACCAGA TTTGAACATC 2160 

TAGTGAGGTT CACATTCATA CTAAGTTTTC AACATTGTGT TCTTTTTGCA TTCATTTTTT 2220 

50 

ACTTTTATTA AAGGTTCAAA ACC 2243 

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1816 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

5 

GGTGGGNAGC TITNAATTTC CCCTTACWGG GGCGCWTAA GGGGAAACCT TCCCGGAATT 60 

TTCGQGTCGA CCCACGCGTC CGGCCAGCCT AGGAGAAGAA GTTCGTAGTC CCAGAGGTGA 120 

10 GGCAGGAGGC GGCAGTTTCT GGCGGGTGAG GGCGGAGCTG AAGTGACAGC GGAGGCGGAA 180 

GCAACGGTCG GTGGGGCGGA GAAGGGGGCT GGCCCCAGGA GGAGGAQGAA ACCCTTCCGA 240 

GAAAACAGCA ACAAGCTGAG CTGCTGTGAC AGAGGGGAAC AAGATGGCGG CGCCGAAGGG 300 

15 

GAGCCTCTGG GTGAGGACCC AACTGGGGCT CCCGCCGCTG CTGCTGCTGA CCATGGCCTT 360 

GGCCGGAGGT TCGGGGACCG CTTCGGCTGA AGCATTTGAC TCGGTCTTGG GTGATACGGC 420 

20 GTCTTGCCAC CGGGCCTGTC AGTTGACCTA CCCCTTGCAC ACCTACCCTA AGGAAGAAGA 480 

GTTGTACGCA TGTCAGAGAG GTTGCAGGCT GTTTTCAATT TGTCAGTTTG TGGATGATGG 540 

AATTGACTTA AATCGAACTA AATTGGAATG TGAATCTGCA TGTACAGAAG CATATTCCCA 600 

25 

ATCTGATGAG CAATATGCTT GCCATCTTGG KTGCCAGAAT CAGCTGCCAT TCGCTGAACT 660 

GAGACAAGAA CAACTTATGT CCCTGATGCC AAAAATGCAC CTACTCTTTC CTCTAACTCT 720 

30 GGTGAGGTCA TTCTGGAGTG ACATGATGGA CTCCGCACAG AGCTTCATAA CCTCTTCATG 780 

GACTTTTTAT CTTCAAGCCG ATGACGGAAA AATAGTTATA TTCCRGTCTA AGCCCAGRAA 840 

TCCCAGGTAC GCACCACATT TGGAGCCAGG AGCCCTACCA AATTTGRGRG RAWCMTCTCT 900 

35 

AAGCAAAATG TCCNTCAKMT CGSMAATGAG AAATTCACAA GCGCACAGGA ATTTTCTTGA 960 

AGATGGAGAA AGTGATGGCT TTTTAAGATG CCTCTCTCTT AACTCTGGGT GGATTTTAAC 1020 

40 TACAACTCTT GTCCTCTCGG TGATGGTATT GCTTTGGATT TGTTGTGCAA CTTGTTGCTA 1080 

CACGCTGTTG GACGCAGTAT AGTTTCCCTC TGAGAAGCTG AGTATCTATG GTGACTTGGA 1140 

GTTTATGAAT GAACAAAAGC TAAACAGATA TCCAGCTTCT TCTCTTGTGG TTGTTAGATC 1200 

45 

TAAAACTGAA GATCATGAAG AAGCAGGGCC TCTACCTACA AAAGTGAATC TTGCTCATTC 1260 

TGAAATTTAA GCATTTTTCT TTTAAAAGAC AAGTGTAATA GACATCTAAA ATTCCACTCC 1320 

50 TCATAGAGCT TTTAAAATGG TTTCATTGGA TATAGGCCTT AAGAAATCAC TATAAAATGC 1380 

AAATAAAGTT ACTCAAATCT GTGAAAAAAA AAAAAAAAAA AAAAAAAAAC TCGAGGGGGG 1440 

GCCCGTTACC AAKTCGCCCT ATOGTGADTB GTATTMTTAT TTTACTAATA TCTGTAGCTA 1500 

55 

TrTTGTTTTT KGCTTKGGTT ATKGTTTTTY TCCCTTYTCT WAGCTATRAG CTGATCATKG 1560 

CYSCTTCTCA CCTCCTGCCA TGATACTGTC AGTTACCTTA GTTAACAAGC TGAATATTTA 1620 

GTAGAAATGA TGCTTCTGCT CAGGAATGGC CCACAAATCT GTAATTTGAA ATTTAGCAGG 1680

AAATGACCTT TAATGACACT ACATTTTCAG GAACTGAAAT CATTAAAATT TTATTTGAAT 1740

AATTATGTGC TGAAAAAAAA AAAAAAAAAA AMWMRARASK RRWWACTCGA GGGGGGGCCC 1800

GGTACCCNAT TCGCCG 1816



(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 945 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

20 

AGAAACCGTT GATGGGACTG AGAAACCAGA GTTAAAACCT CTTTGGAGCT TCTGAGGACT 60 

CAGCTGGAAC CAACGGGCAC AGTTGGCAAC ACCATCAACT TCTCCCAAGC AGAGAAACCC 120 

25 GAACCCACCA ACCAGGGGCA GGATAGCCTG AAGAAACATC TACACGCAGA AATCAAAGTT 180 

ATTGGGACTA TCCAGATCTT GTGTGGCATG ATGGTATTGA GCTTGGGGAT CATTTTGGCA 240 

TCTGCTTCCT TCTCTCCAAA TTTTACCCAA GTGACTTCTA CACTGTTGAA CTCTGCTTAC 300 

30 

CCATTCATAG GACCCTTTTT TTTTATCATC TCTGGCTCTC TATCAATCGC CACAGAGAAA 360 

AGGTTRACCA AGCTTTTGGT GCATAGCAGC CTGGTTGGAA GCATTCTGAG TGCTCTGTCT 420 

35 GCCCTGGTGG GTTTCATTAT CCTGTCTGTC AAACAGGCCA CCTTAAATCC TGCCTCACTG 480 

CAGTGTGAGT TGGACAAAAA TAATATACCA ACAAGAAGTT ATGTTTCTTA CTTTTATCAT 540 

GATTCACTTT ATACCACGGA CTGCTATACA GCCAAAGCCA GTCTGGCTGG AWCTCTCTCT 600 

40 

CTGATGCTGA TTTGCACTCT GCTGGAATTC TGCCTAGCTG TGCTCACTGC TGTGCTGCGG 660 

TGGAAACAGG CTTACTCTGA CTTCCCTGGG AGTGTACTTT TCCTGCCTCA CAGTTACATT 720 

45 GGTAATTCTG GCATGTCCTC AAAAATGACT CATGACTGTG GATATGAAGA ACTATTGACT 780 

TCTTAAGAAA AAAGGGAGAA ATATTAATCA GAAAGTTGAT TCTTATGATA ATATGGAAAA 840 

GTTAACCATT ATAGAAAAGC AAAGCTTGAG TTTCCTAAAT GTAAGCTTTT AAAGTAATGA 900 

50 

ACATTAAAAA AAACCATTAT TTCACTGTCA TTTAAAGATA ATGTG 945 



(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 902 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

5 

GGCAGAGCCA CAGGAAGGAT GAGGAAGACC AGGCTCTGGG GGCTGCTGTG GATGCTCTTT 60 

GTCTCAGAAC TCCGAGCTGC AACTAAATTA ACTGAGGAAA AGTATGAACT GAAAGAGGGG 120 

10 CAGACCCTGG ATGTGAAATG TGACTACACG CTAGAGAAGT TTGCCAGCAG CCAGAAAGCT 180 

TGGCAGATAA TAAGGGACGG AGAGATGCCC AAGACCCTGG CATGCACAGA GAGGCCTTCA 240 

AAGAATTCCC ATCCAGTCCA AGTGGGGAGG ATCATACTAG AAGACTACCA TGATCATGGT 300 

15 

TTACTGCGCG TCCGAATGGT CAACCTTCAA GTGGAAGATT CTGGACTGTA TCAGTGTGTG 360 

ATCTACCAGC CTCCCAAGGA GCCTCACATG CTGTTCGATC GCATCCGCTT GGTGGTGACC 420 

20 AAGGGTTTTT CAGGGACCCC TGGCTCCAAT GAGAATTCTA CCCAGAATGT GTATAAGATT 480 

CCTCCTACCA CCACTAAGGC CTTGTGCCCA CTCTATACCA GCCCCAGAAC TGTGACCCAA 540 

GCTCCACCCA AGTCAACTGC CGATGTCTCC ACTCCTGACT CTGAAATCAA CCTTACAAAT 600 

25 

GTGACAGATA TCATCAGGGT TCCGGTGTTC AACATTGTCA TTCTCCTGGC TGGTGGATTC 660 

CTGAGTAAGA GCCTGGTCTT CTCTGTCCTG TTTGCTGTCA CGCTGAGGTC ATTTGTACCC 720 

30 TAGGCCCACG AACCCACGAG AATGTCCTCT GACTTCCAGC CACATCCATC TGGCAGTTGT 780 

GCCAAGGGAG GAGGGAGGAG GTAAAAGGCA GGGAGTTAAT AACATGAATT AAATCTGTAA 840 

TCACCRGCTA AAAAAAAAAA AAAAAAAACN CGANCCTNGG TTTTCAGCTC CATCAGCTCC 900 

35 

TT 902 



(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1883 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

AGAAAACAAC TGAAAAACCA CATTTTTCTA CATACAGCTG GGGAGGTAGC TGAGAACTTG 60 

GCACTGCGCA CACATACTAG GTTGAAAGAG AGTTGAGGAA ACCAGAAGGC CAAGTGGATC 120 

55 TGCTGGCAAA CCCTGAACCT GTCTCCTGCG CTTGCTCTAC AGTTCTGAAG TTGAAAATCC 180 

TTTTCATGCC TAGCATCTGC TTGAGTTATA AACCCCAAGG CAGCCATGTC ATAGACTAGT 240 

GTTTACTCTT GTTTTGACTT TGTTTTAATG CTTCCTAAGA CCCAAGTGCC TCCTGCTGTT 300 




TCCTCCTTTG TGGTAGCCTC TGGCCATCTG GGACCTCAAT CCCCAGCTTT CCCACTTTCA 360 

GCAGTCCTTT GCTCTCTTTG CTTCTACCTC AAATAGCCCC AGGAGTGGGC TTTAGTCTCC 420 

AATATGGAGC ATYTCAAGCT TCTCCTGGGG GATGGGGATT GGGATGGGCA GAATCTGTTT 480 

TGGWTCTCCG GGTTATTTCC AGTGGGTGTA AAAGCAGAGC TGGGCCTTTC CCTCTCTTAT 540 

CCCTGAGGGT GGGTAAGAAG GACTGTATCT ACACCTGTTC TTCCCTACCT TCTCTTTO3T 600 

TAGGGAGGCC TCATTCTAAG TTCCTCAAGA GAGTCCTTGG CTTAAAGCTG TAGCAAGGGT 660 

GTGCTAGGTG GGGGATTTGG AGCAAAACCG TCGAGTAGGC ATGATACTGG TATGGAGTGG 720 

GCCTGCAAAA TCAGACAGAA ATGGCTTGAG AAGCCGCAGG GGAGCATGCC TGTCTCTCAG 780 

TGATAGAGTA TGGGAGGGAC CTCCCTAGCT TGGAAAATGA GAATTGAAGG GGTTATGAAC 840

AAATAGGATG CCTAGTTGAG GATGTTCCCA AAGTTTTGTC CAATCTTATC ATTAGTAGAT 900 

TTTATAAGCC ACAGAGACAA ACCAGAAACG GAATAATGTT ACTTTGGATG CTTTATTTTT 960 

TTGTTCTAGG TGTGGCTTTG TACATGCAGA AGAATGCTAT ATGCTGCACA TTTTGCCTTT 1020 

AAAGTCTTAC GACTTTCCCC ATTTTAGTCT AATGGGAAGA TACAGATGTG CAAGTCTGCT 1080 

TTTTTGTTTT TTGTTATTAT TTTTTTTTTT TTGCTCTGTG TTATGGACAT TTTCAGAGAT 1140 

GCACAGAAGT GGAGAGGATG GTCCTTGGAC CCCATGTGTC CATCACCTAG CTGCATCACT 1200 

TATCAGCTAT GGTCAACCTG GTTTCATCTG TATCTCTCTC TTTTCACCTG TATTGTTTAT 1260 

TGAAAATCCA AGACACTATG CCAATGCAAC CGTGACTACT TTGQGAGATT GGTAGTCTCT 1320 

TTTGATGGTG ATAGTGATGG GGTGCACTAT CATAATCACA TCAGGTCTGC TTTTTGCTTT 1380 

TAATGTTAAC TAATGAAGTT CCAGAGATGG GCCTTAGAAA TGTGTTTTAA GAATTAACAA 1440 

GGAGTCTCAA AAAGAAATGA GAGGGATGCT TCCTTTCCCC TTCCATCTAC AAAACAAGAG 1500 

AGAGACTGTT CTGTTGTAAA ACTCTTTCAA AAATTCTGAT ATGGTAAGGT AOTTGAGACC 1560 

CTTCACCAGA ATGTCAATCT TTTTTTCTGT GTAACATGGA AACTTGTGTG ACCATTAGCA 1620 

TTGTTATCAG CTTGTACTGG TCTCATAACT CTGGTTTTGG AAGAATAATT TGGAAATTGT 1680 

TGCTGTGTTC TGTGAAAATA ACCTCCCCAA AATAATTAGT AACTGGTTGT TCTACTTGGT 1740 

AATTTGACAC CCTGTTAATA ACGCAATTAT TTCTGTGTTC TTAAACAGTA TAAATAGTTG 1800 

TAAGTTTGCA TGCATGATGG AAAAATAAAA ACCTGTATCT CTGTTAAAAA AAAAAAAAAA 1860 

AAAAAAAAAA AAAAAAAAAA AAA 1883 



(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2100 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

TACTTTTAGA TTTACTGCCT TCAAAAAGTG CCTATTCTGA GCAACATAAA CGTTATTCCT 60 

10 TACATATGTA TGTACACACG GTACCCAGAG TCCTACTGTG GCAGCCTTCA AAAACATACC 120 

ATCAGAAAGA GTAGGTGCTG AGATAAGGNA ACTTTGCCAA ATGNAAGAAA GTCACTCACT 180 

TCCAATATCC CCTCTTCAAG CGGCTACCGT GRAASGGGCT GCAAACACAT TCCCTGAGCA 240 

15 

TCCCTTGCTG ATACAGCTTC TTTATATTTA TATCCTACTG GATGGTAGCA TATTGCTAAG 300 

GTTTCCTGTA CTCTGCTTCA AGGGAATGTA AGYTTTATGG CATTGAAACA TTTAGGAAAA 360 

20 AAAAAGATGT TTAAGAGAAT TAATAGAGCC GTAGTCTGTA TTAGGATGTG TGTCATATGT 420 

GTGTTCTATA AACTAAGCAT CGGTGGGTTT AGAGTGTTAA AGTGTCAGCA CATTCCTTCT 480 

CCTTTTGTCT CTCAGGCTAA CATGAGAGAA AATAGAAAAG TCTTGGCTGT GGGGATTGGA 540 

25 

AGCTCAGGGG GCCAAATGTC CTTGCCAGAT CCTTAGAGCA TTACTTTGAC TCCTAAAAAT 600 

AGTAGTGTAT GTTATTTGAT GGCTTTTGTT TCCATAGTTC CATCACTGAC AAAACTGTCA 660 

30 ATACTGTTGA TGGAGCAGCA GCATAGCCTA GAGTGATGCA TTCTTACCCA GAGGTGGCAA 720 

TAGGAGAGGG TCCATGTAAA TAGGACGAGG TAGACAGTGC ATGATTGTAG GAGAAGGGTT 780 

GAAGGGAGGA CATGATTCCA AAAAAGATCG TTCTCAATGT GTCGTCTGAC TCAACCAGCT 840 

35 

GGCAGATTAC ACTTGCCAAG TCGTTCCCTT TCCTTCTAAG TCAGTTGGCT CCATATTCAC 900 

TTGAATATGC CTCTGTTTGG GCAAAGCAAG ATACCTCCAC TTAACCTTTA TCCAAGGAAG 960 

40 CTCTTGGTGT CCTCTTGGTC ATAAAGTTGT CTCCTACCTA ACCCAGTTTT ACCAAATGGA 1020 

AGTAAAAGGG GACAAACTAT GGAAGATGGA CTCCATGCCA TTGCAGTCAG CCACCATTCT 1080 

CTTTTCCATA TAAGGAGCCC CATTACATAA GCTACGGGTG AGGTTGGAAC AGCTATGTTT 1140 

45 

CATAATTTCA AGAGTGTGAC CACCCTGCTC TAGTCATCAT CATTGGATGA ATCCAGTTGA 1200 

CTCTTTGGCA AAAGGGTGAT ACTTTTCACT AAAAATGCCT ACTCTTCCTG TTGATGTTCC 1260 

50 TTTTCTGTTT TTACCTTGTC CAATTTCCAC ACTAGTCATT TTTTTTATTT TTTAGAGGAT 1320 

CAGAITTTAG CGCTGGAAAA TGAGTTCAAA AATTTCAGTG TAATGTCATA AGGATGTTGG 1380 

GATACAGAGA TTTTTTTTTT CCTTGGAAAC AAATGGACTG GGAAGAAACA CAGCATGGCT 1440 

55 

TTGCTCTGAG TTTCAATCTG ATGATTATGA CCATGGAAGA TAGTCTTATG TAAAGGTTAA 1500 

ATGGTGTTTA CAAGTGGATA GATAAGGCGG AGATGGTGAG AAGCCGGGTT TTCTCTATGC 1560 

TAAATGTGTC TACTAAGAGC AGCACTTCCT ACTAGCTAAG CACAATCATA GCCCCACCGT 1620

GATGAGCTGC TAGTCTGAAT AACATTCCCT GACTTAGGGA AAGGCACACA AAAACATATA 1680

AAGAATATGT CTATTTTCAT ATGTGTGATA CTGACAGAGC CATGGTATTC CTAAAATATA 1740 

5 

GGTTTCTCTT TTTTCTTGTA TTCTTAGCAA ATTGCATTTA TTCACTACAT TACAAACCAT 1800 

CACTGATGTA TCCAAAATAG CACACATAGT TCAGTATGAA AATAAGAGAA TAAAATCTGT 1860 

10 TATAAGCAAG TGATTTAGGT A'lTTTCTTTT GTGTTTATGC ATTATCTGAC TATATTAAAA 1920 

CCTGTTTTTC TATTTACCTT CTATCAGTTT TCTCTACCAA TTATGTTTTT TCAATGCTCT 1980 

ATAAGAATGA ATATGGAAAT TATATTTCTT TTTTCTGTAA AAGAGTTGCA ACTACTTTAT 2040 

15 

TATATTTAGA AATCCAATAA ACTTCTTATT ACATTTAAAA AAAAAAAAAA AAAACTCGAA 2100 



(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1930 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

CCTTTGANTG TGGTCCCGGG TGCNGATTGG CAGCGCCTCC GCCGCGGCTC GTGGTTGTCC 60 

CGCCATGGCA CTGTCGCGGG GGCTGCCCCG GGAGCTGGCT GAGGCGGTGG CCGGGGGCCG 120 

35 GGTGCTGGTG GTGGGGGCGG GCGGCATCGG CTGCGAGCTC CTCAAGAATC TCGTGCTCAC 180 

CGGTTTCTCC CACATCGACC TGATTGATCT GGATACTATT GATGTAAGCA ACCTCAACAG 240 

ACAGTTTTTG TTTCAAAAGA AACATOTTGG AAGATCAAAG GCACAGGTTG CCAAGGAAAG 300 

40 

TGTACTGCAG TTTTACCCGA AAGCTAATAT CGTTGCCTAC CATGACAGCA TCATGAACCC 360 

TGACTATAAT GTGGAATTTT TCCGACAGTT TATACTGGTT ATGAATGCTT TAGATAACAG 420 

45 AGCTGCCCGA AACCATGTTA ATAGAATGTG CCTGGCAGCT GATGTTCCTC TTATTGAAAG 480 

TGGAACAGCT GGGTATCTTG GACAAGTAAC TACTATCAAA AAGGGTGTGA CCGAGTGTTA 540 

TGAGTGTCAT CCTAAGCCGA CCCAGAGAAC CTTTCCTGGC TGTACAATTC GTAACACACC 600 

50 

TTCAGAACCT ATACATTGCA TCGTTTGGGC AAAGTACTTG TTCAACCAGT TGTTTGGGGA 660 

AGAAGATGCT GATCAAGAAG TATCTCCTGA CAGAGCTGAC CCTGAAGCTG CCTGGGAACC 720 

55 AACGGAAGCC GAAGCCAGAG CTAGAGCATC TAATGAAGAT GGTGACATTA AACGTATTTC 780 

TACTAAGGAA TGGGCTAAAT CAACTGGATA TGATCCAGTT AAACTTTTTA CCAAGCTTTT 840 

TAAAGATGAC ATCAGGTATC TGTTGACAAT GGACAAACTA TGGCGGAAAA GGAAACCTCC 900 

AGTTCCGTTG GACTGGGCTG AAGTACAAAG TCAAGGAGAA GAAACGAATG CATCAGATCA 960

ACAGAATGAA CCCCAGTTAG GCCTGAAAGA CCAGCAGGTT CTAGATGTAA AGAGCTATGC 1020 

5 ACGTCTTTTT TCAAAGAGCA TCGAGACTTT GAGAGTTCAT TTAGCAGAAA AGGGGGATGG 1080 

AGCTGAGCTC ATATGGGATA AGGATGACCC ATCTGCAATG GATTTTGTCA CCTCTGCTGC 1140 

AAACCTCAGG ATGCATATTT TCAGTATGAA TATGAAGAGT AGATTTGATA TCAAATCAAT 1200 

10 

GGCAGGGAAC ATTATTCCTG CTATTGCTAC TACTAATGCA GTAATTGCTG GGTTGATAGT 1260 

ATTGGAAGGA TTGAAGATTT TATCAGGAAA AATAGACCAG TGCAGAACAA TTTTTTTGAA 1320 

15 TAAACAACCA AACCCAAGAA AGAAGCTTCT TGTGCCTTGT GCACTGGATG CTCCCAACCC 1380 

CAATTGTTAT GTATGTGCCA GCAAGCCAGA GGTGACTGTG CGGCTGAATG TCCATAAAGT 1440 

GACTGTTCTC ACCTTACAAG ACAAGATAGT GAAAGAAAAA TTTGCTATGG TAGCACCAGA 1500 

TGTCCAAATT GAAGATGGGA AAGGAACAAT CCTAATATCT TCCGAAGAGG GAGAGACGGA 1560 

AGCTAATAAT CACAAGAAGT TGTCAGAATT TGGAATTAGA AATGGCAGCC GGCTTCAAGC 1620 

AGATGACTTC CTCCAGGACT ATACTTTATT GATCAACATC CTTCATAGTG AAGACCTAGG 1680

AAAGGACGTT GAATTTGAAG TTGTTGGTGA TGCCCCGGAA AAAGTGGGGS CCAAACAAGC 1740 

TGAAGATGCT GOCAAAAGCA TAACCAATGG GCAGTGATGA TGGGAGCTTC AGCCCTCCAC 1800 

CTYCACAGCT TCAAGGAGGC AAGATGGACG TYTCYCATAG TTGATYCGGR TGAAGAAGRT 1860 

TCTCCAATAA TTGCCCGACG TTCATTGAAG GAAGGAGGAG GAGGCCCGCC AAGAGGGGAA 1920 

TTTAGGNTTG 1930



(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

GGCCCTGGCC TCTGGGCTGA GGCTTGCTAG GGACTCGGGG TGGCTCTAAG GGGCAGGGAT 60

AGGGCTGGGG AGCGCCGGCC TGTGGCCCTG ACCAGCCCCT TCTCGTGCRG GTTCCACCCC 120 

GATGCAGGTG GTCACGTGCT TGACGCGGGA CAGCTACCTG ACGCACTGCT TCCTCCAGCA 180 

CCTCATGGTC GTGCTGTCCT CTCTGGAACG CACGCCCTCG CCGGAGCCTG TTGACAAGGA 240 

CTTCTACTCC GAGTTTGGGA ACAAGACCAC AGGGAAGATG GAGAACTACG AGCTGATCCA 300 

CTCTAGTCGC GTCAAGTTTA CCTACCCCAG TGAGGAGGAG ATTGGGGACC TGACGTTCAC 360

TGTGOCCCAA AAGATGGCTG AGCCAGAGAA GGCCCCAGCC CTCAGCATCC TGCTGTACGT 420

GCAGGCCTTC CAGGTGGGCA TGCCACCCCC TGGGTGCTGC AGGGGCCCCC TGCGCCCCAA 480

GACACTCCTG CTCACCAGCT CCGAGATCTT CCTCCTGGAT GAGGACTGTG TCCACTACCC 540

ACTGCCCGAG TTTGCCAAAG AGCCGCCGCA GAGAGACAGG TACCGGCTGG ACGATGGCCG 600

CCGCGTCCGG GACCTGGACC GAGTGCTCAT GGGCTACCAG ACCTACCCGC AGCCCTCACC 660

CTCGTCTTCG ATGACGTGCA AGGTCATGAC CTCATGGGCA GTGTCACCCT GGACCACTTT 720

GGGGAGGTGC CAGGTGGCCC GGCTAGAGCC AGCCAGGGCC GTGAAGTCCA GTGGCAGGTG 780

TTTGTCCCCA GTGCTGAGAG CAGAGAGAAG CTCATCTCGC TGTTGGCTCG CCAGTGGGAG 840

GCCCTGTGTG GCCGTGAGCT GCCTGTCGAG CTCACCGGCT AGCCCAGGCC ACAGCCAGCC 900

TGTCGTGTCC AGCCTGACGC CTACTGGGGC AGGGCAGCAG GCTTTTGTGT TCTCTAAAAA 960

TGTTTTATCC TCCCTTTGGT ACCTTAATTT GACTGTCCTC GCAGAGAAPG TGAACATGTG 1020

TGTGTGTTGT GTTAATTCTT TCTCATGTTG GGAGTGAGAA TGCCGGGCCC CTCAGGGCTG 1080

TCGGTGTGCT GTCAGCCTCC CACAGGTGGT ACAGCCGTGC ACACCAGTGT CGTGTCTGCT 1140

GTTGTGGGAC CGTTGTTAAC ACGTGACACT GTGGGTCTGA CTTTCTCTTC TACACGTCCT 1200

TTCCTGAAGT GTCGAGTCCA GTCCTTTGTT GCTGTTGCTG TTGCTGTTGC TGTTGCTGTT 1260

GGCATCTTGC TGCTAATCCT GAGGCTGGTA GCAGAATGCA CATTGGAAGC TCCCACCCCA 1320

TATTGTTCTT CAAAGTGGAG GTCTCCCCTG ATCCAGACAA GTGGGAGAGC CCGTGGGGGC 1380

AGGGGACCTG GAGCTGCCAG CACCAAGCGT GATTCCTGCT GCCTGTATTC TCTATTCCAA 1440

TAAAGCAGAG TTTGACACCG TCAAAAAAAA AAAAAAAAAA AAAAAAAAAA ATTNCTGCGG 1500

CCTCAAGGG 1509



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3173 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

TCGACCCCAS GCGTCCGTGC TTTTCCACAG AAGGTTAGAC CCTGAAAGAG ATGGCTCAGC 60
ACCACCTATG GATCTTGCTC CTTTCCCTGC AAACCTGGCC GGAAGCAGCT GGAAAAGACT 120 
CAGAAATCTT CACAGTGAAT GGGATTCTGG GAGAGTCAGT CACTTTCCCT GTAAATATCC 180 

WO 98/39448 PCT/US98/04493



AAGAACCACG GCAAGTTAAA ATCATTGCTT GGACTTCTAA AACATCTGTT GCTTATGTAA 240 

CACCAGGAGA CTCAGAAACA GCACCCGTAG TTACTGTGAC CCACAGAAAT TATTATGAAC 300

GGATACATGC CTTAGGTCCG AACTACAATC TGGTCATTAG CGATCTGAGG ATGGAAGACG 360 

CAGGAGACTA CAAAGCAGAC ATAAATACAC AGGCTGATCC CTACACCACC ACCAAGCGCT 420 

ACAACCTGCA AATCTATCGT CGGCTTGGGA AACCAAAAAT TACACAGAGT TTAATGGCAT 480 

CTGTGAACAG CACCTGTAAT GTCACACTGA CATGCTCTGT AGAGAAAGAA GAAAAGAATG 540 

TGACATACAA TTGGAGTCCC CTGGGAGAAG AGGGTAATGT CCTTCAAATC TTCCAGACTC 600 

CTGAGGACCA AGAGCTGACT TACACGTGTA CAGCCCAGAA CCCTGTCAGC AACAATTCTG 660 

ACTCCATCTC TGCCCGGCAG CTCTGTGCAG ACATCGCAAT GGGCTTCCGT ACTCACCACA 720 

CCGGGTTGCT GAGCGTGCTG GCTATGTTCT TTCTGCTTGT TCTCATTCTG TCTTCAGTGT 780 

TTTTGTTCCG TTTGTTCAAG AGAAGACAAG ATGCTGCCTC AAAGAAAACC ATATACACAT 840 

ATATCATGGC TTCAAGGAAC ACCCAGCCAG CAGAGTCCAG AATCTATGAT GAAATCCTGC 900 

AGTCCAAGGT GCTTCCCTCC AAGGAAGAGC CAGTGAACAC AGTTTATTCC GAAGTGCAGT 960 

TTGCTGATAA GATGGGGAAA GCCAGCACAC AGGACAGTAA ACCTCCTGGG ACTTCAAGCT 1020 

ATGAAATTGT GATCTAGGCT GCTGGGCTGA ATTCTCCCTC TGGAAACTGA GTTACAACCA 1080 

CCAATACTGG CAGGTTCCCT GGATCCAGAT CTTCTCTGCC CAACTCITAC TGGGAGATTG 1140 

CAAACTGCCA CATCTCAGCC TGTAAGCAAA GCAGGAAACC TTCTGCTGGG CATAGCTTGT 1200 

GCCTAAATGG ACAAATGGAT GCATACCCTT CCTGAAATGA CTCCCTTCTG AATGAATGAC 1260 

AAAGCAGGTT ACCTAGTATA GTTTTCCCAA ACTTCTTCCC ATCATAGCAC ATGTAGAAAA 1320

TAATATTTTT ATGGCACACT GGGATAAACA AGCAAGATTG CTCACTTCTG GAAGCTGCAT 1380 

ATGACTAGAG GCCTCTTGTG ACTGGAGGTA ACAACCCTGC CCAGTAACTG TGGGAGAAGG 1440 

GGATCAATAT TTTGCACACC TGTAATAGGC CATGGCACAC CAGCCAAGAT GCTCTGCTCA 1500 

CAGTCAGTAT GTGTGAAGAT CCCTGGTGCG TGGCCTTCAC CACGCATXTTT GAGCAAATTA 1560 

GGAAAATGTA CCCTTCGCTT GAGGCAGATG CAGCCCTTCC CCCGAGTCCA TGGCTTGGAG 1620 

AGCAGAATGT GGGCTGCATA TAAGCACACT CATCCCTTTG TCTGGGAATC TTTGTGCAGG 1680 

GCATAACAGG CTTAGTAAGT CCAAACACAG ATGACAGTGC TGTGTGGGTC TCTGTCAGAG 1740 

TTGTGGCTCT CAGCCATGTA GACACACTCT CCAAATGGAG TGTTGGAAAA TGTTCTTTCT 1800 

GCAGGGTCTA GAGACTGCTG GGACACTTTT CTTGGAGTGC TACTTCAGAA GCCTTATAGG 1860

ATTTTCTTTC TGGCCAAGAT TTCCTTCTGT ATCACTCCAA GCAGCCTCAG CAGAAGAAGC 1920 

AGCCATGCCC AGTATTCCCA CTCTCCAAAA GGAACTGACC AGCTTATATT TCTCACACTT 1980
CTGGGGAACT GGGTATAATC CAACCATCAA AATAGAAGAC CTTGCAAGAA GCAGAGTCAT 2040
TCTCCAGAAG GAACTTGGGA GATGATGGTG CAGATGATGA AACTGGGTTC ATCCCAGTTC 2100
CAAAGACTCA GAGAAOTAGA GTTTAAGCTG AGGCAGAGTG CCGCCACCCT GGCATGCCCC 2160
ACAAACAGAT CACCACCCAC CTTACACAGG CATTAACTCT CCTCAATGAG GAAGAATCAT 2220
TCACAACTGA GCAAGACATT CATATGATCA TTTAAGGAAG TGTTTCCCIT ATGTGTTAGC 2280
AAGTATAATC GGCTAACTCC TAAATCCCAA TGAATAGTCC TAGGCTGGAC AGCAATGGGC 2340
TGCAATTAGG CAGATAAAGA CATCAGTCCC AGTAAATGAA TCCATAGACT CATCTAGCAC 2400
CAACTACCAT TAGCACTATG TTAGGAGCTG CAAGGCCCCA AAGTAGAAGA TGTGCATAAT 2460
GTCTGCTCTT GTGTAGCTCA GGAGACAATT CCAGCACAGA CACTACAGTT AACGCTGAAC 2520
TGCAGCTGCA AGTAATAGCA TGAACAGTCA GAAAAATACC TTATGAGGGG GCAGGGCTGA 2580
AGCTGGGCCT TGAAGGATGG ATGAAATTTG GATAGAGAAT GAGGAAGACA GAGGGCCTCC 2640
AAGTGAGAGA AGCATGAAAA ATGAGCAGGG GCCTGGATCA GTGGGGTGTA TTCAGAGCAC 2700
CTCTCCAGAT GCACCATGCA TGCTCACAGT CCCTTGCCTA TGTGTGGCAG AGTCTCCCAG 2760
CCAGATGTGT GCCCCCACCC CATGTCCATT TACATGTCCT TCAATGCCCA CCTCAAAAGG 2820
TACCTCTTCT GTAAAGCTTT CCCTGGTATC AGGAATCAAA ATTAATCAGG GATCTTTTCA 2880
CACTGCTCTT TTTTCCTCTT TGGTCCTTCT ATCACTAAAA CTCATCTCAT TCAGCCTTAC 2940
AGCATAACTA ATTATTTGTT TTCCTCACTA CATTGTACAT GTGGGAATTA CAGATAAACG 3000
GAAGCCKGCT GGGGTGGTGG CTCACGCCTG TAATCCCAAC ACTTTGGGAG GCCAAGGCAG 3060
GCGGATCACC TGAGGTCAGG ARTTCGAGAT TARTCTGGCC AACATGGTGA AACCCCATNT 3120
NTACTAAAAA TACGAAATTA GCCAGGTGTG GTGGCACACA TCTGTAGTCC CAG 3173

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 991 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

AAATTCGGCA CAGCTGAGAG GAGACACAAG GAGCAGCCCG CAAGCACCAA GTGAGAGGCA 60

TGAAGTTACA GTGTGTTTCC CTTTGGCTCC TGGGTACAAT ACTGATATTG TGCTCAGTAG 120

ACAACCACGG TCTCAGGAGA TGTCTGATTT CCACAGACAT GCACCATATA GAAGAGAGTT 180

TCCAAGAAAT CAAAAGAGCC ATCCAAGCTA AGGACACCTT CCCAAATGTC ACTATCCTGT 240

CCACATTGGA GACTCTGCAG ATCATTAAGC CCTTAGATGT GTGCTGCGTG ACCAAGAACC 300

TCCTGGCGTT CTACGTGGAC AGGGTGTTCA AGGATCATCA GGAGCCAAAC CCCAAAATCT 360

TGAGAAAAAT CAGCAGCATT GCCAACTCTT TCCTCTACAT GCAGAAAACT CTGCGGCAAT 420

GTCAGGAACA GAGGCAGTGT CACTCCAGGC AGGAAGCCAC CAATGCCACC AGAGTCATCC 480

ATGACAACTA TGATCAGCTG GAGGTCCACG CTGCTGCCAT TAAATCCCTG GGAGAGCTCG 540

ACGTCTTTCT AGCCTGGATT AATAAGAATC ATGAAGTAAT GTCCTCAGCT TGATGACAAG 600

GAACCTGTAT AGTGATCCAG GGATGAACAC CCCCTGTGCG GTTTACTGTG GGAGACAGCC 660

CACCTTGAAG GGGAAGGAGA TGGGGAAGGC CCCTTGCAGC TGAAAGTCCC ACTGGCK5GC 720

CTCAGGCTGT CTTATTCCGC TTGAAAATAG CCAAAAAGTC TACTGTGGTA TTTGTAATAA 780

ACTCTATCTG CTGAAAGGGC CTGCAGGCCA TCCTGGGAGT AAAGGGCTGC CTTCCCATCT 840

AATTTATTGT GAAGTCATAT AGTCCATGTC TGTGATGTGA GCCAAGTGAT ATCCTGTAGT 900

ACACATTGTA CTGAGTGGTT TTTCTGAATA AATTCCATAT TTTACCTAAA AAAAAAAAAA 960

AAAAACTCGA GGGGGGGCCC GTACCCAATT T 991

(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1290 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

ACAGCCCTCT TCGGAGCCTG AGCCCGGCTC TCCTCACTCA CCTCAACCCC CAGGCGGCCC 60

CTCCACAGGG CCCCTCTCCT GCCTGGACGG CTCTGCTGGT CTCCCCGTCC CCTGGAGAAG 120

AACAAGGCCA TGGGTCGGCC CCTGCTGCTG CCCCTRCTGC YCCTGCTGCW GCCGCCAGCA 180



TTTCTGCAGC CTRGTGGCTC CACAGGATCT GGTCCAAGCT ACCTTTATGG GGTCACTCAA 240 

CCAAAACACC TCTCAGCCTC CATGGGTGGC TCTGTGGAAA TCCCCTTCTC CTTCTATTAC 300 

CCCTGGGAGT TAGCCAYAGY TCCCRACGTG AGAATATCCT GGAGACGGGG CCACTTCCAC 360 

GGGCAGTCCT TCTACAGCAC AAGGCCGCCT TCCATTCACA AGGATTATGT GAACCGGCTC 420 

TTTCTGAACT GGACAGAGGG TCAGGAGAGC GGCTTCCTCA GGATCTCAAA CCTGCGGAAG 480

GAGGACCAGT CTGTGTATTT CTGCCGAGTC GAGCTGGACA CCCGGAGATC AGGGAGGCAG 540 

CAGTTGCAGT CCATCAAGGG GACCAAACTC ACCATCACCC AGGCTGTCAC AACCACCACC 600 

ACCTGGAGGC CCAGCAGCAC AACCACCATA GCCGGCCTCA GGGTCACAGA AAGCAAAGGG 660

CACTCAGAAT CATGGCACCT AAGTCTGGAC ACTGCCATCA GGGTTGCATT GGCTGTCGCT 720 

GTGCTCAAAA CTGTCATTTT GGGACTGCTG TGCCTCCTCC TCTGTGGTGG AGGAGAAGGA 780 

AAGGTAGCAG GGCGCCAAGC AGTGACTTCT GACCAACAGA GTGTGGGGAG AAGGGATGTG 840 

TATTAGCCCC GGAGGACGTG ATGTGAGACC CGCTTGTGAG TCCTCCACAC TCGTTCCCCA 900 

TTGGCAAGAT ACATGGAGAG CACCCTGAGG ACCTTTAAAA GGCAAAGCCG CAAGGCAGAA 960 

GGAGGCTGGG TCCCTGAATC ACCGACTGGA GGAGAGTTAC CTACAAGAGC CTTCATCCAG 1020 

GAGCATCCAC ACTGCAATGA TATAGGAATG AGGTCTGAAC TCCACTGAAT TAAACCACTG 1080

GCATTTGGGG GCTGTTYATT ATAGCAGTGC AAAGAGTTCC TTTATCCTCC CCAAGGATGG 1140 

AAAATACAAT TTATTTTGCT TACCATACAC CCCTTTTCTC CTCGTCCACA TTTTCCAATC 1200 

TGTATGGTGG CTGTCTTCTA TGGCAGAAGG TTTTGGGGAA TAAATAGCCT GAMATGI7TNC 1260 

TGACTNAAAA AAAAAAAAAA AAAAACTCGA 1290 

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2290 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

TGGGGCCCCT TTTGGATGCT CTGGGTGTTT TTGCCAAGAG TTACAGGATG TCAAGTGTGG 60 

GGAGCTCAGC ACCCTTGCTG TGGACCAGTG AAGGCTGTTC CAGACCAGGT GCTTCCAGAC 120

ATTTCCAGGC TCCAGGAGAG AGGCTGGGAG CCCCCACAGA AAGCACAGGA AAATGCAAAA 180 

AAAAAACAGT CTTTTTTTTT TTTTTGCTTT TTATTATGAA AACAAAACAA ATGCCCCAGG 240 

AGAAGGGTCC ATGATTACCA GAAACATCAA AGAGTACTTT CTACCATTTT TATTCTGTTG 300 

TGTTGAGGCC AGCATTGCAA TAAACAAGCT AAACTACTTA CATTGGACTC ATTTTCAGTA 360 

ACTGACATTT ACAGGAATAT ACTAGAAACG GCACTAAAAA GTTTAAGAAA AGTTACGGTA 420

AACTTGCATG CACATCATAC AGAAAAGTAA CATTTTAAAT ATAAAAAAGA AAAACTTCCT 480 

GGAAGCATTA TGCCAGTATT AAGGAACAGT GCTACTCTGG ATGTGACAAA TTCTGTATGT 540 

GGGTGTTACT CTTTCCCAAA AGACTGTCAG AGGCGTGAGT GCTGCAAAAG AACAACAACA 600 

AAAACAAACA CACAAAAAAA TGTGTCTTAC AGTTTGTAAG CAAGATGACA CTGCCCAACA 660 

CAAAGAGGGG TCTGGAGTTC AGTTCACGCC CGAAGCCTGC CCCCTCGGCC TCCAGGGGTC 720

ATTCAGAGTG TTCTCAAATC CAATTCCGAC ACACGACTTG TCACTACTCC TCTCCCCTTG 780
AAAAAAGCAT GTTAGAAGCT GCCCTACAGG TCTCAGCAGT GGGACAATCT AATTGAATCA 840 
CCGCAGCCT? CTAATACAGA AGAAACGGAC GTGACTGTCA CCCTCAGCCC GCCAGCAAGG 900 
GCGCTGAGGA AGTCATTAAT CCTTCGAAAC TCTGAAAAGA AACCAGTGTT GAAGTCTGGA 960 
CAGAAAGCCT TAAAAAAGTG ACAGCACCAA TGCAGCTGCT CAGTGTACCC NCCGTGGGCT 1020 
GTCAGGGTCA GTGGCTTCTT TCTAGATGAA AGGAGCAGAG GCGAGCCGAC GCCACCGTCA 1080 
CAGAGAACCA GCCGAGAAGG AAAGGCCCCA CGATGCTCCC TGTGCGCTGC CCCCACAGCC 1140 

GGCCGCTCCC CCGACGGCTC ACACAGGCAG CACCTCACTG CCCTGTGGCT GGAGGGGCAT 1200 

TGCAAGGAGC GCCCCCCAGC CCCAGGCACC CCCGGCTTAG GGTGTACGTA TCACCCAGCC 1260

CTGTGCTGGC AGCACGTTAC CAACCAGCCT GCGTGAAGAC CTGTCAACTG TCGTCTGTGA 1320

ATTCCTTAAA TTCGGTTTAA ATAGTCCATT AAAGATCTGT TTAGAAAATA CCTTTGAAAA 1380 

CGAGGGTAAC TTTAAAAAAT GGAAACTITC AAATCCATTT ATATTTTTAT TATAAACAAA 1440 

ACTTAATTAA AAGTTTAACA AACTGGCTGA AAACTCACCA AGTGTCAGAC TCACCAGCAA 1500 

TTTAAAAAAT GATAATTTAC CAGCATCTCC TCATCAGAGT TCCCTCTCCA GTAAGGGTAT 1560 

ACCTACATCT GTAAGGGTCA GTGGACTCTG AATCAATTTT ATGGTTGTTT TAAAATCACC 1620 

GTGTATTAGG ATACTAATGA TAGTCCCTAT ATCCATCCAG AAATGCTGGC AGAAAGCACT 1680 

GGCCACCATA CAGGACAGAC CACACCACAG CTCCATACCC AGCGTCTGCC TGGAGGCTCC 1740 

CCCACGCTGA GGTCCGGGAG AATGCCTGGT TTCAGTCATT TCCGGACTAA CTGTGACAAC 1800 

GCGTGAGCAG GGAGCACCGT GCGAGTCTCC GGGAGGGAAT CCTCCTGGGG CCCAGAGACT 1860

CCTCCACCCC TGGGGAGGGC AGACAGGCTC GGGARGGCCT GGCCAGGCCA CTGGAGGCTG 1920 

GCAGGGAGCA GGCATGTCCA CCCGCAAGCC TGGGAGGCTA ACTCTGGCAT TCCTGGCCGG 1980 

AGCCGCCATG CTCATTGGTG GGCCAGTTTG GGACATCCCC GTACTCAAAG ACCATATGGC 2040 

AGCCTCTGGG AAAACAAAAC CAAAACATCA CCTTCTATTA AACTCTGTAT ATTATTATTT 2100 

TTTACAATAG AAAGTTAAAA ATCAAGACTT AGATTTACTA TACATTTTTT CTCTCAGATT 2160 

ACAAAGTTTA TATTATATAA CTGGGGTTCC CTAAATTGAT TTCTTTTAAA ACAGTCTTAA 2220 

AGAGACCAGA AGTGAATACA AAAGAACTAA ACAAAATAAA AAATTAGAAT GTGCTGTAGC 2280 

TOAAAGCTGT 2290 



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 549 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

GGCACGAGCC ATGCCTGGCC TCTCCTTGAT TCTTACAGTC ACTTTGTTGG CTGTTTCTGA 60 

CTCAGCAGCT ACCTGCATTG TGGCCAAAGG ATGACCTATT CCTTCTCAGG AGQGCAAAAA 120 

TGTGGAATAG TGTCTGTCCA TGCCTCTCCT CATGGGCTAC CACCTCTGCC ACCGTGGTTA 180 

ATCAGTAACA ACCAGGAGAG AAGCTGCTGG AACTGACCTC TGGGAACTCC CTGGGATGGT 240

TTGGTGCAGG AATGTAGTAG GCATACACGT GGTTGCGTGG ATCTGGGCCC TCCTGATGTG 300 

AGTAGAGAGG TAAAAGGCCA CCATCTCCTT GACCTCTGGG GAACTCATCC ACAAAGAAGA 360 

TGTTTCCAAG ATGCTTCTGA AGATTGCCTA AAAATAGCCG GTTTCCACCC CCGTGAATGC 420 

ATCCATTCTA GAATGCTCCT TCACCAGGAC CAGAGAACTG ATTTACAGAA GTGACATGAA 480 

AACATTCCAT CCCAGAATTT GCAGTAGCTC AAATTAAGTT TCTAGCTATT AAAAAGAAAA 540

AAAAAAAAA 549 



(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GGCACGAGGG CTCATTCATT CCGCGCCGGG CCTGCCAGAC ACCTGCGCCC TTCTGCAGCC 60 

GCCCGCCGCA TCCGCCGCCG CAGCCCCCAG CATGTCGGGC CCAGACGTCG AGACGCCGTC 120 

CGCCATCCAG ATCTGCCGGA TCATGCGGCC AGATGATGCC AACGTGGCCG GCAATGTCCA 180 

CGGGGGGACC ATCCTGAAGA TGATCGAGGA GGCAGGCGCC ATCATCAGCA CCCGGCATTG 240 

CAACAGCCAG AACGGGGAGC GCTGTGTGGC CGCCCTGGCT CGTGTCGAGC GCACCGACTT 300

CCTGTCTCCC ATGTGCATCG GTGAGGTGGC GCATGTCAGC GCGGAGATCA CCTACACCTC 360 

CAAGCACTCT GTGGAGGTGC AGGTCAACGT GATGTCCGAA AACATCCTCA CAGGTGCCAA 420 

AAAGCTGACC AATAAGGCCA CCCTGTGGTA TGTGCCCCTG TCGCTGAAGA ATGTGGACAA 480 

GGTCCTCGAG GTGCCTCCTG TTGTGTATTC CCGGCANGAG CAGGAGGAGG AGGGCCGGAA 540 

GCGGTATGAA GCCCAGAAGC TGGAGCGCAT GGAGACCAAG TGGAGGAACG GGGACATCGT 600

CCAGCCAGTC CTCAACCCAG AGCCGAACAC TGTCAGCTAC AGCCAGTCCA GCTTGATCCA 660

CCTGGTGGGG CCTTCAGACT GCACCCTGCA CGGCTTTGTG CACGGAGGTG TGACCATGAA 720 

GCTCATGGAT GAGGTCGCCG GGATCGTGGC TGCACGCCAC TGCAAGACCA ACATCGTCAC 780 

AGCTTCCGTG GACGCCATTA ATTTTCATGA CAAGATCAGA AAAGGCTGCG TCATCACCAT 840 

CTCGGGACGC ATGACCTTCA CGAGCAATAA GTCCATGGAG ATCGAGGTGT TGGTGGACGC 900

CGACCCTGTT GTGGACAGCT CTCAGAAGCG CTACCGGGCC GCCAGTGCCT TCTTCACCTA 960 

CGTGTCGCTG AGCCAGGAAG GCAGGTCGCT GCCTGTGCCC CAGCTGGTGC CCGAGACCGA 1020 

GGACGAGAAG AAGCGCTTTG AGGAAGGCAA AGGGCGGTAC CTGCAGATGA AGGCGAAGCR 1080 

ACAGGGCCAC GCGGASCYTC AGCCCTAGAC TCCCTCCTCC TGCCACTGGT GCCTCGAGTA 1140 

GCCATGGCAA CGGGCCCAGT GTCCAGTCAC TTAGAAGTTC CCCCCTTGGC CAAAAACCCA 1200

ATTCACATTG AGAGCTGGTG TTGTCTGAAG TTTTCGTATC ACAGTGTTAA CCTGTACTCT 1260 

CTCCTGCAAA CCTACACACC AAAGCTTTAT TTATATCATT CCAGTATCAA TGCTACACAG 1320 

TGTTGTCCCG AGCGCCGGGA GGCGTTGGGC AGAAACCCTC GGGAATGCTT CCGAGCACGC 1380 

TGTAGGGTAT GGGAAGAACC CAGCACCACT AATAAAGCTG CTGCTTGGCT GGAAAAAAAA 1440 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1500

AGAAAAAAN 1509 



(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1316 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

AGCTGTATCA TAGGAAAGAT GGCCACACCG GCGGTACCAG TAAGTGCTCC TCCGGCCACG 60 

CCAACCCCAG TCCCGGCGGC GGCCCCAGCC TCAGTTCCAG CGCCAACGCC AGCACCGGCT 120 

GCGGCTCCGG TTCCCGCTGC GGCTCCAGCC TGCATCCTCA GACCCTGCGG CAGCAGCGGC 180 

TGCAACTGCG GCTCCTGGCC AGACCCCGGC CTCAGCGCAA NTCCAGCGCA GACCCCAGCG 240 

CCCGCTCTGC CTGGTCCTGC TCTTCCAGGG CCCTTCCCCG GCGGCCGCGT GGTCAGGCTG 300

CACCCAGTCA TTTTGGCCTC CATTGTGGAC AGCTACGAGA GACGCAACGA GGGTGCTGCC 360 

CGAGTTATCG GGACCCTGTT GGGAACTGTC GACAAACACT CAGTGGAGGT CACCAATTGC 420 

TTTTCAGTGC CGCACAATGA GTCAGAAGAT GAAGTGGCTG TTGACATGGA ATTTGCTAAG 480

AATATGTATG AACTGCATAA AAAAGTTTCT CCAAATGAGC TCATCCTGGG CTGGTACGCT 540

ACGGGCCATG ACATCACAGA GCACTCTCTG CTGNATCCAT GAGTACTACA GCCGAGAGGC 600

CCCCAACCCC ATCCACCTCA CTGTGGACAC AAGTCTCCAG AACGGCCGCA TGAGCATCAA 660

AGCCTACGTC AGCACTTTAA TGGGAGTCCC TGGGAGGACC ATGGGAGTGA TGTTCACGCC 720

TCTGACAGTG AAATACGCGT ACTACGACAC TGAACGCATC GGAGTTGACC TGATCATGAA 780

GACCTGCTTT AGCCCCAACA GAGTGATTGG ACTCTCAAGT GACTTGCAGC AAGTAGGAGG 840

GGCATCAGCT CGCATCCAGG ATGCCCTGAG TACAGTGTTG CAATATGCAG AGGATGTACT 900

GTCTGGAAAG GTGTCAGCTG ACAATACTGT GGGCCGCTTC CTGATGAGCC TGGTTAACCA 960

AGTACCGAAA ATAGTTCCCG ATGACTTTGA GACCATGCTC AACAGCAACA TCAATGACCT 1020

TTTGATGGTG ACCTACCTGG CCAACCTCAC ACAGTCACAG ATTGCACTCA ATGAAAAACT 1080

TGTAAACCTG TGAATGGACC CCAAGCAGTA CACTTGCTGG TCTAGGTATT AACCCCAGGA 1140

CTCAGAAGTG AAGGAGAAAT GGGTTTTTTG TGGTCTTGAG TCACACTGAG ATAGTCAGTT 1200

GTGTGTGACT CTAATAAACG GAGCCTACCT TTTGTAAATT AAAAAAAAAA AAAAAAACCN 1260

SGRGGGGGGG CCCGGTCCCA TTSSCCCTTT NGTAATTCGT NTTACAATCC CCNGGC 1316



(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 777 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

GGCATGWKCA GACATGACTT CTATTGCCAG GCTGGTCAAG TGGCAGGGTC ATGAGGGAGA 60 

CATCGATAAG GGTGCTCCTT ATGCTCCCTG CTCTGGAATC CACCAGCGGG CTATCTGCGT 120 

TTATGGGGCT GGGGACTAGA ATTGGATGCT TCAAAACCAT CACCTGTTGG CCAACAAGTT 180 

TGACCCAAAG GTAGATGATA ATGCTCTTCA GTGCTTAGAA GAATACCTAC GTTATAAGGG 240

CCATTCTATT GGGACCTGAA CTTTCAAGAC CACAMTATTG AAGAGGCGTT GCTTACCYGT 300 

TGGGGGCCAA GAGGCATGTT ACCAAACATG GYYCARGAAM YTTGGYKGGG AMCARKKKKG 360 

GKKGGGARRM CMRGGGYTTG SCAAWTTCSK KGGCMWCCYT TTAGGGTAAR RRGGGCKGTW 420 

ATTAGATTGT GGGTAAAGTA GGATCTTTTG CCCTTGCAAA TTTGCTGCCT GGGTGAATGY 480 

TGCTTGTTCC TTCTCMACCC CTAACCCTAG TAGTTCCTCC ACTAACTTTC TCACTAAGTG 540

AGAATGAGAA CTGCTGTGAT AGGGAGAGTG AAGGAGGGAT ATGTGGTAGA GCACTTGATT 600

TCAGTTGAAT GCCTGCTGGT AGCTTTTCCA TTCTGTGGAG CTGCCGTTCC TAATAATTCC 660 

AGGTTTGGTA GCGTGGAGGA GAACTTTGAT GGAAAGAGAA CCTTCCCTTC TGTACTGTTA 720 

ACTTAAAAAT AAATAGCTCC TGATTCAAAG TAAAAAAAAA AAAAAAAAAA AAAAAAA 777 

(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 791 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

GGCACAGATA ACTATGTACA TGTATTCCTT AAATGTTTTT TTAAGTTTTA TATTCTTGGC 60 
ACTGGTCTTC AAATGTGTAC ATGTGTGCCA GGGAGCAAAT GCCTTCTTGT TTCTGAAATT 120

GGTCTTTTAG ACTGTTCTTT TTTCCCATCT TCTCACCTCC TGCCCCTCCT TCAGGGTACT 180 
TCCGTGGCCA GAACCCCTCC AGGTCAGAGG CAGAAGAGAA GCCTCATGGG TCACAGCAGC 240 

AGATGTGGGC TGGAGATCTA TTCATTTGGT TTTGGCTTGA ATTTTCTGRA TGGTTTACTT 300 
GATCYTGGGA AAGANATATC TTGCCAGGAA AAATGATAGN CCTTGACAAT GTTGAATGAT 360 
CCTGCACCAC CTTGAAAGAC ATTTCTAATA TGGTTTGTCA GGCAAAGTGG TTAGTAGTCA 420
TTTGTGGCCT GAGGTAGAAG TCCTCAGAAA TCAGCAGACT TCACTGATAA AATGCTGACT 480 
TGCCCCTGGA CTGGGCTCTG TGAGAGTGGC CTTCTGCACT GTGCACAGTA GGTGTGAACA 540 

CACCACACCT ACAGGGACCA CGTGGTGGGC TGTGGACTAG CGGCCAAGCT <XCTGCAGGC 600 
CCACTAATAG AATTCAGCTT TTAGCATGGG CTGTTTCATA CTGTTCTGAT GAAACTGATT 660 
TGGTTTCTTT CCTCCATACC CCTTCTGCAT TTCAGTGTTT TTGTTTAGTT TTCCTGGTTT 720
TTAATTATAA CTACAAAATA AAATCTTTAG GCTATTCACC TTAGCTTAGT AAAAAAAAAA 780 
AAAAAAAACT C 791 

(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

AAATTGATTA ACAGCTTGAA AGAAGGCIVT GGTTTTGAAG GCCTAGATAG CAGCACTGCC 60 

AGTAGCATGG AGCTGGAAGA ACTTCGGCAT GAGAAAGAGA TGCAGAGGGA GGAAATACAG 120 

AAGCTGATGG GCCAGATACA TCAGCTCAGA TCCGAATTAC AGGATATGGA GGCACAGCAA 180 

GTTAATGAAG CAGAATCAGC AAGAGAACAG TTACAGGWTC TGCATGACCA AATAGCTGGG 240



CAGAAAGCAT CCAAACAAGA ACTAGAGACA GAACTGGAGC GACTGAAGCA GGAGTTCCAC 300 



TATATAGAAG AAGATCTTTA TCGAACAAAG AACACA1TGC AAAGCAGAAT TAAAGATCGA 360 

GACGAAGAAA TTCAAAAACT CAGGAATCAG CTTACCAATA AAACTTTAAG CAATAGCAGT 420 



CAGTCTGAGT TAGAAAATCG ACTCCATCAG CTAACAGAGA CTCTCATCCA GAAACAGACC 480 
ATGCTGGAGA GTCTCAGCAC AGAAAAGAAC TCCCTGGTCT TTCAACTGGA GCGCCTCGAA 540
CAGCAGATGA ACTCCGCCTC TGGAAGTAGT AGTAATGGGT CTTCGATTAA TATGTCTGGA 600 



ATTGACAATG GTGAAGGCAC TCGTCTGCGA AATGTTCCTG TTCTTTTTAA TGACACAGAA 660 

ACTAATCTGG CAGGAATGTA CGGAAAAGTT CGCAAAGCTG CTAGTTCAAT TGATCAGTTT 720 

AGTATTCGCC TGGGAATTTT TCTCCGAAGA TACCCCATAG CGCGAGTTTT TGTAATTATA 780 

TATATGGCTT TGCTTCACCT CTGGGTCATG ATTGTTCTGT TGACTTACAC ACCAGAAATG 840

CACCACGACC AACCATATGG CAAATGAACC AAGCCCAGTT GTTGCAGTGA TTGGTTGTCT 900 

TTTTCTAGAC TTGGGATCTG CAAGAAGGCC AATTGCCTAA AATTTCTGAG AACAGTGCAC 960 

AAGATTATTT TATCACTACA AGCTTTTAAC TTTTTAAGTT ATTGTACAAG TATTCTACCT 1020 

AAATCTTCCA ATTTCCTTTA AATGGTAAGA GTTTCTAAAA CAGACAATAA TTTAACAAGC 1080 

TCAGCTCTGC TTTATCTGAG TTTAGTGGTC CTAATATATA TGTAGAGAAA GATGGTGGGG 1140



TTGTTCACCT CTGTACAGAC CATCTGTATG TTAGGTGACA TTGATTATGG GTTATAATCA 1200 

GGGAAACTAA TTGTATTTAG TGACAAAAAT AAAAAGTTTT TTTTTTATAA TTCAGTCTGC 1260 

TTTTGGATTT TCATATATTT AACTTTGCAA AAAGATTTAC TTTGTACATG TTACAGGCTT 1320 



GATTGGTGTA AATCTTTTTA TAAATACATA AATAAAAGNA AAATATGCAT TTTTCTTTTC 1380 
TAAAAAAAAA AAAAAAAAAA CTCGA 1405



(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1596 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

GTCATGCAGT GCGCCGGAGA ACTGTGCTCT TTGAGGCCGA CGCTAGGGGC CCGGAAGGGA 60 

AACTGCGAGG CGAAGGTGAC CGGGGACCGA GCATTTCAGA TCTGCTCGGT AGACCTGGTG 120 

CACCACCACC ATGTTGGCTG CAAGGCTGGT GTGTCTCCGG ACACTACCTT CTAGGGTTTT 180 

CCACCCAGCT TTCACCAAGG CCTCCCCTGT TGTGAAGAAT TCCATCACGA AGAATCAATG 240 

GCTGTTAACA CCTAGCAGGG AATATGCCAC CAAAACAAGA ATTGGGATCC GGCGTGGGAG 300 

AACTGGCCAA GAACTCAAAG AGGCAGCATT GGAACCATCG ATGGAAAAAA TATTTAAAAT 360

TGATCAGATG GGAAGATGGT TTGTTGCTGG AGGGGCTGCT GTTGGTCTTG GAGCATTGTG 420 

CTACTATGGC TTGGGACTGT CTAATGAGAT TGGAGCTATT GAAAAGGCTG TAATTTGGCC 480 

TCAGTATGTC AAGGATAGAA TTCATTCCAC CTATATGTAC TTAGCAGGGA GTATTGGTTT 540 

AACAGCTTTG TCTGCCATAG CAATCAGCAG AACGCCTGTT CTCATGAACT TCATGATGAG 600 

AGGCTCTTGG GTGACAATTG GTGTGACCTT TGCA3CCATG GTTGGAGCTG GAATGCTGGT 660

ACGATCAATA CCATATGACC AGAGCCCAGG CCCAAAGCAT CTTGCTTGGT TGCTACATTC 720 

TGGTGTGATG GGTGCAGTGG TGGCTCCTCT GACAATATTA GGGGGTCCTC TTCTCATCAG 780 

AGCTGCATGG TACACAGCTG GCATTGTGGG AGGCCTCTCC ACTGTGGCCA TGTGTGCGCC 840 

CAGTGAAAAG TTTCTGAACA TGGGTGCACC CCTGGGAGTG GGCCTGGGTC TCGTCTTTGT 900 

GTCCTCATTG GGATCTATGT TTCTTCCACC TACCACCGTG GCTGGTGCCA CTCTTTACTC 960

AGTGGCAATG TACGGTGGAT TAGTTCTTTT CAGCATGTTC CTTCTGTATG ATACCCAGAA 1020 

AGTAATCAAG CGTGCAGAAG TATCACCAAT GTATGGAGTT CAAAAATATG ATCCCATTAA 1080 

CTCGATGCTG AGTATCTACA TGGATACATT AAATATATTT ATGCGAGTTG CAACTATGCT 1140 

GGCAACTGGA GGCAACAGAA AGAAATGAAG TGACTCAGCT TCTGGCTTCT CTGCTACATC 1200 

AAATATCTTG TTTAATGGGG CAGATATGCA TTAAATAGTT TGTACAAGCA GCTTTCGTTG 1260

AAGTTTAGAA GATAAGAAAC ATGTCATCAT ATTTAAATGT TCCGGTAATG TGATGCCTCA 1320 

GGTCTGCCTT TTTTTCTGGA GAATAAATGC AGTAATCCTC TCCCAAATAA GCACACACAT 1380 

TTTCAATTCT CATGTTTGAG TGATTTTAAA ATGTTTTGGT GAATGTGAAA ACTAAAGTTT 1440 

GTGTCATGAG AATGTAAGTC TTTTTTCTAC TTTAAAATTT AGTAGGTTCA CTGAGTAACT 1500 

AAAATTTAGC AAACCTGTGT TTGCATATTT TTTKGGAGTG CAGMMTAWTG TAATTARAGC 1560

ATTCCAGTAA NAGTGTNTTT AAAGTTGNTC TATATN 1596 

(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2293 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

GCGCAGAGCC CGYACGAGCA GGACGACGAC GACAAGGGCG ACTCCAAGGA AACGCGGCTG 60 

ACCCTGATGG AGGAAGTGCT CCTGCTGGGC CTCAAGGACC GCGARGGTTA CACATCATTT 120 

TGGAATGACT GTATATCATC TGGATTACGT GGCTGTATGT TAATTGAATT AGCATTGAGA 180 

GGAAGGTTAC AACTAGAGGC TTGTGGAATG AGACGTAAAA GTCTATTAAC AAGAAAGGTA 240 

ATCTGTAAGT CAGATGCTCC AACAGGGGAT GTTCTTCTTG ATGAAGCTCT GAAGCATG1T 300

AAGGAAACTC AGCCTCCAGA AACGGTCCAG AACTGGATTG AATTACTTAG TGGTGAGACA 360 

TGGAATCCAT TAAAATTGCA TTATCAGTTA AGAAATGTAC GGGAACGATT AGCTAAAAAC 420 

CTGGTGGAAA AGGGTGTATT GACAACAGAG AAACAGAACT TCCTACTTTT TGACATGACA 480 

ACACATCCCC TCACCAATAA CAACATTAAG CAGCGCCTCA TCAAGAAAGT ACAGGAAGCC 540 

GTTCTTGACA AATGGGTGAA TGACCCTCAC CGCATGGACA GGCGCTTGCT GGCCCTCATT 600

TACCTGGCTC ATGCCTCGGA CGTCCTGGAG AATGCTTTTG CTCCTCTTCT GGACGAGCAG 660 

TATGATTTGG CTACCAAGAG AGTGCGGCAG CTTCTCGACT TAGACCCTGA AGTGGAATGT 720 

CTGAAGGCCA ACACCAATGA GGTTCTGTGG GCGGTGGTGG CGGCGTTCAC CAAGTAACTC 780 

TGCTCGGGGT GAACCATTCT CCTTTCTCTC AAGTAAACCA GTAGTTTTTC TTCTGTTGAC 840 

TTCTGGTTTT CTGTAATTTG TACTTTCCCA CACTATAATT GGCTTCTGTT TTACAAAATG 900

GTGGGTGGCT TTTTCTTTTT TGTACGTGTA CAGGATTCTG CTGGTACGAG AGGCCTTCCT 960 

CTTTCTCTTT TTAAAAAAAG TTTTACTGCC ATATTGGCAT TCCATTCCCT GTTGCCATCC 1020 

TCACTGTTAC CTGTTTTGGG TTTCTGGTCT ACTTTGACTT TCAAAGTACC TCCAGCCTCC 1080 

TCATACGCAC AGCTTTTGGA TGACCTCAGC TTGAGTTTCT CCATATGTGC ATGTACATCT 1140 

AGCATTCTGC CTACAGTTCA GACAGAAGTC ACAAAAAGGC CTTCAACTCA CCAAAGGTAA 1200

ATATCTGTAT CTATTAGGAC ATTTTTTACA TAGACTTCAG TTGAGATGTA TACTTAGCAA 1260 

AATTATTTTT AAATTGAAAC AGCACAGTAA ATACTTAATA TAAAATGTCC CTTGGATTTT 1320 

GCTTCCCATG TAAATCTATT GTATTATTAC ACTTGTTATA ATTTTAACTA TAAAGGTCCA 1380 

ATTGTTTCAC AGAGCCAGTT TGGGATGGGC TGCATTCCAT TTATGCTGTA TATAGTTTGA 1440 

ATTATATATA AATTACCCCT TCTTCTGGCC ACCCCTGCTC CCATCTTAGT ATTTTGCAAG 1500

ATCTAATCAG TTGTACACCT GGTGCCCCTC GCTTGCTTCA ATCATGGTTA TTTGATGGCA 1560

AAATCGACCT CTTGTCGCTG AAGGAGAGAG AAAAGATGTG TGTCTGATTG GTCCTGOGAT 1620

TTTTTSAGCT GTGCCATTTA TGGTACTCTT TGCCTATGCA TCCCCTTTTT AGATTTTTTT 1680 

TAAATTTTAT CTTACTGTTT TTATAATTTC TATTGGGAAG AGGCTTGTGA CCAGTACCAA 1740 

TCTTGAGTTT CTTTTTCTGT CCACAAGTAA ATTAATATCT GCTCTGAAAT GTCATTTATC 1800

TACTCACACA TTCTTGGGGA AAAAAATCAA ATGTCAGTCC TAGCAGATGT TGCATGTAAA 1860 

TTGGTAGCAA GTAATGATTA CAACCCAGAG GATTAAGAAT TTTGTAACAG AAAGCTCTAT 1920

GTTTTAATTT TTTATATACA ATTAGGATAA TTAGCATTGT CAGACTATAA ACCTTTGCTT 1980 

TTTAAAGTTT ATTTTTACTA TTTCTTTATC ACTTTATTGT ATCATCACCA TTGGTTTCAT 2040 



AATGTAAATA CTATATGTTG AACAAATTAA ATGTCAAAAT TTTTTATTAC CATAGTCCAT 2100

GTTAATAGTG GGGCTTTCAG GTGTTTAGAG ATTTTTTTTG TTGTTGTTAA CATTCATTGC 2160

AAAAGTACTA GATGGTGTAT AACTCTAGAG TTGAATTTTA AGGGATTCCC TAATATGTAT 2220

ACTATCTTTT TATCTGAAGT AATAAATAAA CAATGATCTT GAAAGTGCCY RAAAMAAAAA 2280

AAAAAAAAAA AAA 2293



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1212 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

GGCACGAGGC GAGCCGGCGC ACCGTACGCT GGGACGTGTG GTTTCAGCTC GTGCGCCTCC 60 
CCGTGGGTTT GCGACGTTTA GCGACTATTG CGCCTGCGCC ACGCCGGCTG CGAGACTGGG 120
GCCGTGGCTG CTGGTCCCGG GTGATGCTAG GCGGCTCCCT GGGCTCCAGG CTGTTGCGGG 180 
GTGTAGGTGG GAGTCACGGA CGGTTCGGGG CCCGAGGTGT CCGCGAAGGT GGCGCACATG 240 

GGCGGCAGGG GAGAGCATGG CTCAGCGGAT GGTCTGGGTG GACCTGGAGA TGACAGGATT 300 
GGACATTGAG AAGGACCAGA TTA1TGAGAT GGCCTGTCTG ATAACTGACT CTGATCTCAA 360 
CATTTTGGCT GAAGGTCCTA ACCTGATTAT AAAACAACCA GATGAGTTGC TGGACAGCAT 420
GTCAGATTGG TGTAAGGAGC ATCACGGGAA GTCTGGCCTT ACCAAGGCAG TGAAGGAGAG 480 
TACAATTACA TTGCAGCAGG CAGAGTATGA ATTTCTGTCC TTTGTACGAC AGCAGACTCC 540 

TCCAGGGCTC TGTCCACTTG CAGGAAATTC AGTTCATGAA GATAAGAAGT TTCTTGACAA 600

ATACATGCCC CAGTTCATGA AACATCTTCA TTATAGAATA ATTGATGTGA GCACTGTTAA 660 

AGAACTGTGC AGACGCTGGT ATCCAGAAGA ATATGAATTT GCACCAAAGA AGGCTGCTTC 720 

TCATAGOGCA CTTGATGACA TTAGTGAAAG CATCAAAGAG CTTCAGTTTT ACCGAAATAA 780 

CATCTTCAAG AAAAAAATAG ATGAAAAGAA GAGGAAAATT ATAGAAAATG GGGAAAATGA 840 

GAAGACCGTG AGTTGATGCC AGTTATCATG CTGCCACTAC ATCGTTATCT GGAGGCAACT 900 

TCTGGTGGTT TTTTTTTCTC ACGCTGATGG CTTGGCAGAG CACCTTCGGT TAACTTGCAT 960 

CTCCAGATTG ATTACTCAAG CAGACAGCAC ACGAAATACT ATTTTTCTCC TAATATGCTG 1020

TTTCCATTAT GACACAGCA3 CTCCTTTGTA AGTACCAGGT CATGTCCATC CCTTGGTACA 1080 

TATATGCATT TGCTTTTAAA CCATTTCTTT TGTTTAAATA AATAAATAAG TAAATAAAGC 1140 

TAGTTCTATT GAAATGCAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1200 

AAAAAAAAAA AN 1212 

(2) INFORMATION FOR SEQ ID NO: 187:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1605 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

GCTTCCGGAA GTTGCTTTTG TCCAAACATC CGGGCTTCTC CTTTTTGTGT TCCGGCCGAT 60 

CCCACCTCTC CTCGACCCTG GACGTCTACC TTCCGGAGGC CCACATCTTG CCCACTCCGC 120

GCGCGGGGCT AGCGCGGGTT TCAGCGACGG GAGCCCTCAA GGGACATGGC AACTACAGCG 180 

GCGCCGGCGG GOGGCGCCCG AAATGGAGCT GGCCCGGAAT GGGGAGGGTT CGAAGAAAAC 240 

ATCCAGGGCG GAGGCTCAGC TGTGATTGAC ATGGAGAACA TGGATGATAC CTCAGGCTCT 300 

AGCTTCGAGG ATATGGGTGA GCTGCATCAG CGCCTGCGCG AGGAAGAAGT AGACGCTGAT 360 

GCAGCTGATG CAGCTGCTGC TGAAGAGGAG GATGGAGAGT TCCTGGGCAT GAAGGGCTTT 420

AAGGGACAGC TGAGCCGGCA GGTGGCAGAT CAGATGTGGC AGGCTGGGAA AAGACAAGCC 480 

TCCAGGGCCT TCAGCTTGTA CGCCAACATC GACATCCTCA GACCCTACTT TGATGTGGAG 540 

CCTGCTCAGG TGCGAACAGG GCTCCTGGAG TCCATGATCC CTATCAAGAT GGTCAACTTC 600 

CCCCAGAAAA TTGCAGGTGA ACTCTATGGA CCTCTCATGC TGGTCTTCAC TCTGGTTGCT 660 

ATCCTACTCC ATGGGATGAA GACGTCTGAC ACTATTATCC GGGAGGGCAC CCTGATGGGC 720

ACAGCCATTG GCACCTGCTT CGGCTACTGG CTGGGAGTCT CATCCTTCAT TTACTTCCTT 780

GCCTACCTGT GCAACGCCCA GATCACCATG CTGCAGATGT TGGCACTGCT GGGCTATGGC 840 

CTCTTTGGGC ATTGCATTGT CCTGTTCATC ACCTATAATA TCCACCTCCA CGCCCTCITC 900 

TACCTCTTCT GGCTGTTGGT GGGTGGACTG TCCACACTGC GCATGGTAGC AGTGTTGGTG 960 

TCTCGGACCG TGGGCCCCAC ACAGCGGCTG CTCCTCTGTG GCACCCTGGC TGCCCTACAC 1020

ATGCTCTTCC TGCTCTATCT GCATTTTGCC TACCACAAAG TGGTAGAGGG GATCCTGGAC 1080 

ACACTGGAGG GCCCCAACAT CCCGCCCATC CAGAGGGTCC CCAGAGACAT CCCTGCCATG 1140 

CTCCCTGCTG CTCGGCTTCC CACCACCGTC CTCAACGCCA CAGCCAAAGC TGTTGCGGTG 1200 

ACCCTGCAGT CACACTGACC CCACCTGAAA TTCTTGGCCA GTCCTCTTTC CXTGCAGCTGC 1260 

AGAGAGGAGG AAGACTATTA AAGGACAGTC CTGATGACAT GTTTCGTAGA TGGGGTTTGC 1320

AGCTGCCACT GAGCTGTAGC TGCGTAAGTA CCTCCTTGAT GCNTGTCGGC ACTTCTGAAA 1380 

GGCACAAGGC CAAGAACTCC TGGCCAGGAC TGCAAGGCTC TGCAGCCAAT GCAGAAAATG 1440 


GGTCAGCTCC TTTGAGAACC CCTCCCCACC TACCCCTTCC TTCCTCTTTA TCTCTCCCAC 1500 

ATTGTCTTGC TAAATATAGA CTTGGTAATT AAAATGTTGA TTGAAGTCTG GAAAAAAAAA 1560 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAC TCGAG 1605



(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1516 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

ATTCGGCATG AGGGGGTCAC GTGGTGGCTG GGCCGGGGAA ATGGCGGCTT CAGGAGAGAG 60

CGGGACTTCA GGCGGCGGAG GCAGCACCGA GGAAGCATTT ATGACCTTCT ACAGTGAGGT 120 

GAAACAAATA GAGAAGAGAG ACTCGGTTCT AACTTCGAAA AATCAGATTG AAAGACTGAC 180 


CCGTCCTGGT TCCTCTTACT TCAATTTGAA CCCATTTGAG GTTCTTCAGA TAGATCCTGA 240 

AGTTACAGAT GAAGAAATAA AAAAGAGGTT TCGGCAGTTA TCCATCTTGG TGCATCCTGA 300 

CAAAAATCAA GATGATGCTG ACAGAGCACA AAAGGCTTTT GAAGCTGTGG ACAAAGCTTA 360

CAAGTTGCTA CTGGATCAGG AGCAAAAGAA GAGGGCCCTG GATGTAATTC AGGCAGGAAA 420 

AGAATACGTG GAACACACTG TGAAAGAGCG AAAAAAACAA TTAAAGAAGG AAGGAAAACC 480 

TACAATTGTA GAGGAGGATG ATCCTGACCT GTTCAAACAA GCTGTATATA AACAGACAAT 540

GAAACTCTTT GCAGAGCTGG AAATTAAAAG GAAAGAGAGA GAAGCCAAAG AGATGCATGA 600 

AAGGAAACGA CAAAGGGAAG AAGAGATTGA AGCTCAAGAA AAAGCCAAAC GGGAAAGAGA 660 

GTGGCAGAAA AACTTTGAGG AAAGTCGAGA TGGTCGTGTG GACAGCTGGC GAAACTTCCA 720

AGCCAATACG AAGGGGAAGA AAGAGAAGAA AAATCGGACC 1TCCTGAGAC CACCGAAAGT 780

AAAAATGGAG CAACGTGAGT GACCGCCCAA GGTCACAGGC ACAGAACCTT TCCCCTGCTA 840

TCTCCCTTCC TGCTTCGAAG GACTCATTCT TTCCTCCCAC TTCCACCCCA ACATAGAGTA 900

GTATTTGCTT TTTAGTCCAT TTTGTTTTCA ATACGATTTA ATATCGATCA GAGTAATTCT 960

TTTGTACATT GAAATGAGGG GCTTGGTTTA AAAAAAGACC TTTCCCTCTC CCTGCCCCTA 1020

GAACAACCAG TATTAGAAGG TGCCACCATT GGTGCTGCCT TCTCTTCCCA CAGCCTGTAA 1080

CTCAGTGTTT TGTACTTCAC TGAATTGTGA TGGTTAGAAA CTTCGTGGAT AGTTTTGTGGA 1140

AATCATCCAA TTAAACATAC TGCTTAAAAC AGTGTTGCTG TGACTTCAGA GACAAGCCTG 1200

GAAGGGGCAC CTTAGGAAGC CCCTTCGCTT CAGTTGCTCG CTTCTGGGTG TGCTCCCTTC 1260

GAAGGCCCAG ATAAGACAGG GAACACTTGT GAGCACACAG AGCAGCATCT GATGCCCTGT 1320

GGTGTTTGGC ATGTGCCCCC TGTCTACTGA CCAATCAGTG TGGCATGAGG CCCACGCCAC 1380

CCAAACCTTT CACTTTCCAA AGAGCTAGCC GTCCTCCACC CAGTACCATG TCCTAGCCTG 1440

TCTGCATTTG TTAGTGGTAA TATTCTTTAT GTATAATAAA TTTTTATACC CAAAAAAAAA 1500

AAAAAAAAAA ACTCGA 1516



(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 681 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

GCTCCCATGT TGCTGGCTGT CCGTACATCA CCCTGTCCCC TGCAGGAGGG GGCTACAGGC 60

CATCTCCCTC CTGTAGGCCT CTGACTCCCC TCCACTTTTG GGCCCTCAGC TTATCTCGGG 120 

CAGGGGACCA TTGCAGCATC CTCCCCTCCT CNGGACTCAA GGTGCTGAGG TATAAGCCCT 180 

GGGCCCCAGA TCCCTGRTKA CACCTTCCTG GAGAAGACTC TCAAAAGTGA CTGTATATTT 240 

GAGTTCACCA GCAATAACTC CCCACACTCG AAGCAGGTCC AAACCCMAGG ATCCCAGGGT 300 

CCTTGGGCTC TGTGGCACTG TCTTCCCAAG ATCCTTCCTG TTGCACAATG GGAAACCTAA 360






